1977 ACM Turing Award Lecture 


The 1977 ACM Turing Award was presented to John Backus
at the ACM Annual Conference in Seattle, October 17. In intro- 
ducing the recipient, Jean E. Sammet, Chairman of the Awards 
Committee, made the following comments and read a portion of 
the final citation. The full announcement is in the September 
1977 issue of Communications, page 681.

“Probably there is nobody in the room who has not heard of 
Fortran and most of you have probably used it at least once, or at 
least looked over the shoulder of someone who was writing a For- 
tran program. There are probably almost as many people who 
have heard the letters BNF but don’t necessarily know what they 
stand for. Well, the B is for Backus, and the other letters are 
explained in the formal citation. These two contributions, in my 
opinion, are among the half dozen most important technical 
contributions to the computer field and both were made by John 
Backus (which in the Fortran case also involved some col- 
leagues). It is for these contributions that he is receiving this 
year’s Turing award. 

The short form of his citation is for ‘profound, influential, 
and lasting contributions to the design of practical high-level 
programming systems, notably through his work on Fortran, and 
for seminal publication of formal procedures for the specifica- 
tions of programming languages.’ 

The most significant part of the full citation is as follows: 

‘. . . Backus headed a small IBM group in New York City
during the early 1950s. The earliest product of this group’s 
efforts was a high-level language for scientific and technical
computations called Fortran. This same group designed the first
system to translate Fortran programs into machine language. 
They employed novel optimizing techniques to generate fast 
machine-language programs. Many other compilers for the lan- 
guage were developed, first on IBM machines, and later on virtu- 
ally every make of computer. Fortran was adopted as a U.S. 
national standard in 1966. 

During the latter part of the 1950s, Backus served on the 
international committees which developed Algol 58 and a later 
version, Algol 60. The language Algol, and its derivative com- 
pilers, received broad acceptance in Europe as a means for de- 
veloping programs and as a formal means of publishing the 
algorithms on which the programs are based. 

In 1959, Backus presented a paper at the UNESCO confer- 
ence in Paris on the syntax and semantics of a proposed inter- 
national algebraic language. In this paper, he was the first to 
employ a formal technique for specifying the syntax of program- 
ming languages. The formal notation became known as BNF— 
standing for “Backus Normal Form,” or “Backus Naur Form” to 
recognize the further contributions by Peter Naur of Denmark. 

Thus, Backus has contributed strongly both to the pragmatic 
world of problem-solving on computers and to the theoretical 
world existing at the interface between artificial languages and 
computational linguistics. Fortran remains one of the most 
widely used programming languages in the world. Almost all 
programming languages are now described with some type of 
formal syntactic definition.’ ” 

Conventional programming languages are growing
ever more enormous, but not stronger. Inherent defects 
at the most basic level cause them to be both fat and 
weak: their primitive word-at-a-time style of programming
inherited from their common ancestor (the von Neumann computer),
their close coupling of semantics to
state transitions, their division of programming into a 
world of expressions and a world of statements, their 
inability to effectively use powerful combining forms for 
building new programs from existing ones, and their lack 
of useful mathematical properties for reasoning about 
programs. 

An alternative functional style of programming is 
founded on the use of combining forms for creating 
programs. Functional programs deal with structured 
data, are often nonrepetitive and nonrecursive, are hier- 
archically constructed, do not name their arguments, and 
do not require the complex machinery of procedure 
declarations to become generally applicable. Combining 
forms can use high level programs to build still higher 
level ones in a style not possible in conventional lan- 
guages. 

Associated with the functional style of programming
is an algebra of programs whose variables range over 
programs and whose operations are combining forms. 
This algebra can be used to transform programs and to 
solve equations whose “unknowns” are programs in much 
the same way one transforms equations in high school 
algebra. These transformations are given by algebraic 
laws and are carried out in the same language in which 
programs are written. Combining forms are chosen not 
only for their programming power but also for the power 
of their associated algebraic laws. General theorems of 
the algebra give the detailed behavior and termination 
conditions for large classes of programs. 

A new class of computing systems uses the functional 
programming style both in its programming language and 
in its state transition rules. Unlike von Neumann lan- 
guages, these systems have semantics loosely coupled to 
states — only one state transition occurs per major com- 
putation. 

Key Words and Phrases: functional programming, 
algebra of programs, combining forms, functional forms, 
programming languages, von Neumann computers, von 
Neumann languages, models of computing systems, ap- 
plicative computing systems, applicative state transition 
systems, program transformation, program correctness, 
program termination, metacomposition 

CR Categories: 4.20, 4.29, 5.20, 5.24, 5.26 

Introduction 

I deeply appreciate the honor of the ACM invitation 
to give the 1977 Turing Lecture and to publish this 
account of it with the details promised in the lecture. 
Readers wishing to see a summary of this paper should 
turn to Section 16, the last section. 

1. Conventional Programming Languages: Fat and 
Flabby 

Programming languages appear to be in trouble. 
Each successive language incorporates, with a little 
cleaning up, all the features of its predecessors plus a few 
more. Some languages have manuals exceeding 500 
pages; others cram a complex description into shorter 
manuals by using dense formalisms. The Department of 
Defense has current plans for a committee-designed 
language standard that could require a manual as long 
as 1,000 pages. Each new language claims new and 
fashionable features, such as strong typing or structured 
control statements, but the plain fact is that few lan- 
guages make programming sufficiently cheaper or more 
reliable to justify the cost of producing and learning to 
use them. 

Since large increases in size bring only small increases 
in power, smaller, more elegant languages such as Pascal 
continue to be popular. But there is a desperate need for 
a powerful methodology to help us think about programs,
and no conventional language even begins to
meet that need. In fact, conventional languages create 
unnecessary confusion in the way we think about pro- 
grams. 

For twenty years programming languages have been 
steadily progressing toward their present condition of 
obesity; as a result, the study and invention of program- 
ming languages has lost much of its excitement. Instead, 
it is now the province of those who prefer to work with 
thick compendia of details rather than wrestle with new 
ideas. Discussions about programming languages often 
resemble medieval debates about the number of angels 
that can dance on the head of a pin instead of exciting 
contests between fundamentally differing concepts. 

Many creative computer scientists have retreated 
from inventing languages to inventing tools for describ- 
ing them. Unfortunately, they have been largely content 
to apply their elegant new tools to studying the warts 
and moles of existing languages. After examining the 
appalling type structure of conventional languages, using 
the elegant tools developed by Dana Scott, it is surprising 
that so many of us remain passively content with that 
structure instead of energetically searching for new ones. 

The purpose of this article is twofold; first, to suggest 
that basic defects in the framework of conventional 
languages make their expressive weakness and their 
cancerous growth inevitable, and second, to suggest some 
alternate avenues of exploration toward the design of 
new kinds of languages. 


2. Models of Computing Systems 

Underlying every programming language is a model 
of a computing system that its programs control. Some 
models are pure abstractions, some are represented by 
hardware, and others by compiling or interpretive pro- 
grams. Before we examine conventional languages more 
closely, it is useful to make a brief survey of existing 
models as an introduction to the current universe of 
alternatives. Existing models may be crudely classified 
by the criteria outlined below. 

2.1 Criteria for Models 

2.1.1 Foundations. Is there an elegant and concise 
mathematical description of the model? Is it useful in 
proving helpful facts about the behavior of the model? 
Or is the model so complex that its description is bulky 
and of little mathematical use? 

2.1.2 History sensitivity. Does the model include a 
notion of storage, so that one program can save infor- 
mation that can affect the behavior of a later program? 
That is, is the model history sensitive? 

2.1.3 Type of semantics. Does a program successively 
transform states (which are not programs) until a termi- 
nal state is reached (state-transition semantics)? Are 
states simple or complex? Or can a “program” be successively
reduced to simpler “programs” to yield a final
“normal form program,” which is the result (reduction
semantics)?

2.1.4 Clarity and conceptual usefulness of programs.
Are programs of the model clear expressions of a process
or computation? Do they embody concepts that help us 
to formulate and reason about processes? 
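
As a rough illustration of criterion 2.1.3, the following Python sketch computes the same sum under both kinds of semantics. The function names, the tuple encoding of expressions, and the choice of example are assumptions made for illustration only; none of them comes from a model discussed in this paper.

```python
def run_state_transitions(n):
    """State-transition semantics: transform a state (i, total) until a terminal state."""
    state = (n, 0)                 # the state is data, not a program
    trace = [state]
    while state[0] > 0:            # the terminal state has i == 0
        i, total = state
        state = (i - 1, total + i)
        trace.append(state)
    return state[1], trace


def reduce_once(expr):
    """Perform one rewrite step: replace one ("add", x, y) with numeric operands by its sum."""
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    if isinstance(left, int) and isinstance(right, int):
        return left + right
    if not isinstance(left, int):
        return (op, reduce_once(left), right)
    return (op, left, reduce_once(right))


def run_reductions(expr):
    """Reduction semantics: each step yields a simpler program; the normal form is the result."""
    trace = [expr]
    while not isinstance(expr, int):
        expr = reduce_once(expr)
        trace.append(expr)
    return expr, trace
```

In the first function the intermediate objects are states; in the second they are themselves expressions of the same kind as the starting program.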

2.2 Classification of Models 

Using the above criteria we can crudely characterize 
three classes of models for computing systems — simple 
operational models, applicative models, and von Neu- 
mann models. 

2.2.1 Simple operational models. Examples: Turing
machines, various automata. Foundations: concise and
useful. History sensitivity: have storage, are history
sensitive. Semantics: state transition with very simple states.
Program clarity: programs unclear and conceptually not
helpful.

2.2.2 Applicative models. Examples: Church’s
lambda calculus [5], Curry’s system of combinators [6],
pure Lisp [17], functional programming systems described
in this paper. Foundations: concise and useful.
History sensitivity: no storage, not history sensitive.
Semantics: reduction semantics, no states. Program clarity:
programs can be clear and conceptually useful.

2.2.3 Von Neumann models. Examples: von Neu- 
mann computers, conventional programming languages. 
Foundations: complex, bulky, not useful. History sensitiv- 
ity: have storage, are history sensitive. Semantics: state 
transition with complex states. Program clarity: programs 
can be moderately clear, are not very useful conceptually. 

The above classification is admittedly crude and 
debatable. Some recent models may not fit easily into 
any of these categories. For example, the data-flow 
languages developed by Arvind and Gostelow [1], Den- 
nis [7], Kosinski [13], and others partly fit the class of 
simple operational models, but their programs are clearer 
than those of earlier models in the class and it is perhaps 
possible to argue that some have reduction semantics. In 
any event, this classification will serve as a crude map of 
the territory to be discussed. We shall be concerned only 
with applicative and von Neumann models. 


3. Von Neumann Computers 

In order to understand the problems of conventional 
programming languages, we must first examine their 
intellectual parent, the von Neumann computer. What is 
a von Neumann computer? When von Neumann and 
others conceived it over thirty years ago, it was an 
elegant, practical, and unifying idea that simplified a 
number of engineering and programming problems that 
existed then. Although the conditions that produced its 
architecture have changed radically, we nevertheless still 
identify the notion of “computer” with this thirty year 
old concept. 

In its simplest form a von Neumann computer has 
three parts: a central processing unit (or CPU), a store,
and a connecting tube that can transmit a single word 
between the CPU and the store (and send an address to 
the store). I propose to call this tube the von Neumann 
bottleneck. The task of a program is to change the 
contents of the store in some major way; when one 
considers that this task must be accomplished entirely by 
pumping single words back and forth through the von 
Neumann bottleneck, the reason for its name becomes 
clear. 
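
To make the traffic concrete, here is a toy Python model in which every word moving between the CPU and the store passes through a counted channel. The class `Tube`, the function `add_vectors`, and the memory layout are hypothetical and describe no real machine. The sketch counts only data words and ignores instruction fetches and address computation, so it understates the real traffic.

```python
class Tube:
    """Hypothetical one-word channel between the CPU and the store."""

    def __init__(self, store):
        self.store = store         # the store is a plain list of words
        self.words_moved = 0       # number of single-word transfers so far

    def fetch(self, address):
        self.words_moved += 1
        return self.store[address]

    def put(self, address, word):
        self.words_moved += 1
        self.store[address] = word


def add_vectors(tube, a_base, b_base, c_base, n):
    """Store a[i] + b[i] into c[i] for i in 0..n-1, one word at a time."""
    for i in range(n):
        x = tube.fetch(a_base + i)
        y = tube.fetch(b_base + i)
        tube.put(c_base + i, x + y)


store = [1, 2, 3, 10, 20, 30, 0, 0, 0]   # a at 0..2, b at 3..5, c at 6..8
tube = Tube(store)
add_vectors(tube, 0, 3, 6, 3)
```

In this model, adding two n-word vectors moves 3n data words through the tube, even though the task is conceptually a single operation on two vectors.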

Ironically, a large part of the traffic in the bottleneck 
is not useful data but merely names of data, as well as 
operations and data used only to compute such names. 
Before a word can be sent through the tube its address 
must be in the CPU; hence it must either be sent through 
the tube from the store or be generated by some CPU 
operation. If the address is sent from the store, then its 
address must either have been sent from the store or 
generated in the CPU, and so on. If, on the other hand, 
the address is generated in the CPU, it must be generated 
either by a fixed rule (e.g., “add 1 to the program 
counter”) or by an instruction that was sent through the 
tube, in which case its address must have been sent . . . 
and so on. 

Surely there must be a less primitive way of making 
big changes in the store than by pushing vast numbers 
of words back and forth through the von Neumann 
bottleneck. Not only is this tube a literal bottleneck for 
the data traffic of a problem, but, more importantly, it is 
an intellectual bottleneck that has kept us tied to word- 
at-a-time thinking instead of encouraging us to think in 
terms of the larger conceptual units of the task at hand. 
Thus programming is basically planning and detailing 
the enormous traffic of words through the von Neumann 
bottleneck, and much of that traffic concerns not signif- 
icant data itself but where to find it. 


4. Von Neumann Languages 

Conventional programming languages are basically 
high level, complex versions of the von Neumann com- 
puter. Our thirty year old belief that there is only one 
kind of computer is the basis of our belief that there is 
only one kind of programming language, the conven- 
tional — von Neumann — language. The differences be- 
tween Fortran and Algol 68, although considerable, are 
less significant than the fact that both are based on the 
programming style of the von Neumann computer. Al- 
though I refer to conventional languages as “von Neu- 
mann languages” to take note of their origin and style, 
I do not, of course, blame the great mathematician for 
their complexity. In fact, some might say that I bear 
some responsibility for that problem. 

Von Neumann programming languages use variables 
to imitate the computer’s storage cells; control statements 
elaborate its jump and test instructions; and assignment 
statements imitate its fetching, storing, and arithmetic. 
The assignment statement is the von Neumann bottleneck
of programming languages and keeps us thinking
in word-at-a-time terms in much the same way the 
computer’s bottleneck does. 

Consider a typical program; at its center are a number 
of assignment statements containing some subscripted 
variables. Each assignment statement produces a one- 
word result. The program must cause these statements to 
be executed many times, while altering subscript values, 
in order to make the desired overall change in the store, 
since it must be done one word at a time. The program- 
mer is thus concerned with the flow of words through 
the assignment bottleneck as he designs the nest of 
control statements to cause the necessary repetitions. 

Moreover, the assignment statement splits program- 
ming into two worlds. The first world comprises the right 
sides of assignment statements. This is an orderly world 
of expressions, a world that has useful algebraic proper- 
ties (except that those properties are often destroyed by 
side effects). It is the world in which most useful com- 
putation takes place. 
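
The parenthetical remark about side effects can be illustrated with a small Python sketch. The functions `pure_double` and `counting_double` and the global `calls` are hypothetical names introduced here. For a pure function the algebraic law f(x) + f(x) = 2·f(x) holds; once the function also modifies hidden state, the same rewriting changes the meaning of the program.

```python
calls = 0  # hidden state touched by the impure function


def pure_double(x):
    """Depends only on its argument."""
    return 2 * x


def counting_double(x):
    """Returns 2*x plus the number of earlier calls, and updates `calls` as a side effect."""
    global calls
    result = 2 * x + calls
    calls += 1
    return result


# With a pure function, the two expressions are interchangeable.
pure_lhs = pure_double(5) + pure_double(5)
pure_rhs = 2 * pure_double(5)

# With the impure function, the same rewriting gives a different value.
calls = 0
impure_lhs = counting_double(5) + counting_double(5)   # 10 + 11
calls = 0
impure_rhs = 2 * counting_double(5)                    # 2 * 10
```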

The second world of conventional programming lan- 
guages is the world of statements. The primary statement 
in that world is the assignment statement itself. All the 
other statements of the language exist in order to make 
it possible to perform a computation that must be based 
on this primitive construct: the assignment statement. 

This world of statements is a disorderly one, with few 
useful mathematical properties. Structured programming 
can be seen as a modest effort to introduce some order 
into this chaotic world, but it accomplishes little in 
attacking the fundamental problems created by the 
word-at-a-time von Neumann style of programming, 
with its primitive use of loops, subscripts, and branching 
flow of control. 

Our fixation on von Neumann languages has contin- 
ued the primacy of the von Neumann computer, and our 
dependency on it has made non-von Neumann languages 
uneconomical and has limited their development. The 
absence of full scale, effective programming styles 
founded on non-von Neumann principles has deprived 
designers of an intellectual foundation for new computer 
architectures. (For a brief discussion of that topic, see 
Section 15.) 

Applicative computing systems’ lack of storage and 
history sensitivity is the basic reason they have not 
provided a foundation for computer design. Moreover, 
most applicative systems employ the substitution opera- 
tion of the lambda calculus as their basic operation. This 
operation is one of virtually unlimited power, but its 
complete and efficient realization presents great difficul- 
ties to the machine designer. Furthermore, in an effort 
to introduce storage and to improve their efficiency on 
von Neumann computers, applicative systems have 
tended to become engulfed in a large von Neumann 
system. For example, pure Lisp is often buried in large 
extensions with many von Neumann features. The re- 
sulting complex systems offer little guidance to the ma- 
chine designer. 

5. Comparison of von Neumann and Functional
Programs

To get a more detailed picture of some of the defects 
of von Neumann languages, let us compare a conven- 
tional program for inner product with a functional one 
written in a simple language to be detailed further on. 

5.1 A von Neumann Program for Inner Product 

c := 0
for i := 1 step 1 until n do
   c := c + a[i]×b[i]

Several properties of this program are worth noting: 

a) Its statements operate on an invisible “state” according
to complex rules.

b) It is not hierarchical. Except for the right side of 
the assignment statement, it does not construct complex 
entities from simpler ones. (Larger programs, however, 
often do.) 

c) It is dynamic and repetitive. One must mentally 
execute it to understand it. 

d) It computes word-at-a-time by repetition (of the 
assignment) and by modification (of variable i). 

e) Part of the data, n, is in the program; thus it lacks 
generality and works only for vectors of length n. 

f) It names its arguments; it can only be used for 
vectors a and b. To become general, it requires a proce- 
dure declaration. These involve complex issues (e.g., call- 
by-name versus call-by-value). 

g) Its “housekeeping” operations are represented by
symbols in scattered places (in the for statement and the
subscripts in the assignment). This makes it impossible
to consolidate housekeeping operations, the most common
of all, into single, powerful, widely useful operators.
Thus in programming those operations one must always
start again at square one, writing “for i := . . .” and
“for j := . . .” followed by assignment statements sprinkled
with i’s and j’s.
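
For readers who want to run it, the program of Section 5.1 can be transliterated into Python roughly as follows. The concrete vectors, the zero-based indexing, and the `states` list are assumptions of this sketch; the original uses subscripts 1 through n and keeps its state invisible.

```python
a = [1, 2, 3]
b = [6, 5, 4]
n = 3            # part of the data lives in the program, as noted in (e)

c = 0
states = []      # record of the otherwise invisible state after each step
for i in range(n):               # housekeeping: the loop and the subscript i
    c = c + a[i] * b[i]          # one-word result per assignment
    states.append((i, c))
```

Recording `(i, c)` after each iteration makes visible what property (c) asks the reader to do mentally: the program is understood by stepping through its successive states.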

5.2 A Functional Program for Inner Product 
Def Innerproduct
     ≡ (Insert +)∘(ApplyToAll ×)∘Transpose

Or, in abbreviated form:

Def IP ≡ (/+)∘(α×)∘Trans.

Composition (∘), Insert (/), and ApplyToAll (α) are
functional forms that combine existing functions to form
new ones. Thus f∘g is the function obtained by applying
first g and then f, and αf is the function obtained by
applying f to every member of the argument. If we write
f:x for the result of applying f to the object x, then we
can explain each step in evaluating Innerproduct applied
to the pair of vectors <<1,2,3>, <6,5,4>> as follows:

IP:<<1,2,3>, <6,5,4>> =
Definition of IP          ⇒ (/+)∘(α×)∘Trans:<<1,2,3>, <6,5,4>>
Effect of composition, ∘  ⇒ (/+):((α×):(Trans:<<1,2,3>, <6,5,4>>))
Applying Transpose        ⇒ (/+):((α×):<<1,6>, <2,5>, <3,4>>)
Effect of ApplyToAll, α   ⇒ (/+):<×:<1,6>, ×:<2,5>, ×:<3,4>>
Applying ×                ⇒ (/+):<6,10,12>
Effect of Insert, /       ⇒ +:<6, +:<10,12>>
Applying +                ⇒ +:<6,22>
Applying + again          ⇒ 28
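
The evaluation traced above can be approximated in Python by modeling the three functional forms as functions that return functions. This is only a sketch under stated assumptions: the names `compose`, `apply_to_all`, `insert`, and `transpose` are chosen here, sequences are represented as Python lists, every function takes a single argument as in f:x, and Insert is written as a right fold because that is the order the trace shows.

```python
def compose(*functions):
    """f∘g∘h as one function: the rightmost function is applied first."""
    def composed(x):
        for function in reversed(functions):
            x = function(x)
        return x
    return composed


def apply_to_all(f):
    """αf: apply f to every member of a sequence."""
    return lambda xs: [f(x) for x in xs]


def insert(f):
    """/f, folding from the right: /f:<x1,...,xn> = f:<x1, /f:<x2,...,xn>>."""
    def inserted(xs):
        if not xs:
            raise ValueError("insert is left undefined on the empty sequence in this sketch")
        if len(xs) == 1:
            return xs[0]
        return f([xs[0], inserted(xs[1:])])
    return inserted


def transpose(rows):
    """Trans: turn a pair of vectors into a vector of pairs."""
    return [list(column) for column in zip(*rows)]


def add(pair):
    return pair[0] + pair[1]


def multiply(pair):
    return pair[0] * pair[1]


inner_product = compose(insert(add), apply_to_all(multiply), transpose)
result = inner_product([[1, 2, 3], [6, 5, 4]])
```

The definition of `inner_product` names no arguments and contains no loop variable; all repetition is confined to the reusable forms.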


Let us compare the properties of this program with 
those of the von Neumann program. 

a) It operates only on its arguments. There are no 
hidden states or complex transition rules. There are only 
two kinds of rules, one for applying a function to its 
argument, the other for obtaining the function denoted 
by a functional form such as composition, f∘g, or
ApplyToAll, αf, when one knows the functions f and g,
the parameters of the forms.

b) It is hierarchical, being built from three simpler 
functions (+, ×, Trans) and three functional forms, f∘g,
αf, and /f.

c) It is static and nonrepetitive, in the sense that its
structure is helpful in understanding it without mentally
executing it. For example, if one understands the action
of the forms f∘g and αf, and of the functions × and
Trans, then one understands the action of α× and of
(α×)∘Trans, and so on.

d) It operates on whole conceptual units, not words; 
it has three steps; no step is repeated. 

e) It incorporates no data; it is completely general; it 
works for any pair of conformable vectors. 

f) It does not name its arguments; it can be applied to 
any pair of vectors without any procedure declaration or 
complex substitution rules. 

g) It employs housekeeping forms and functions that 
are generally useful in many other programs; in fact, 
only + and × are not concerned with housekeeping.
These forms and functions can combine with others to 
create higher level housekeeping operators. 

Section 14 sketches a kind of system designed to 
make the above functional style of programming avail- 
able in a history-sensitive system with a simple frame- 
work, but much work remains to be done before the 
above applicative style can become the basis for elegant 
and practical programming languages. For the present, 
the above comparison exhibits a number of serious flaws 
in von Neumann programming languages and can serve 
as a starting point in an effort to account for their present 
fat and flabby condition. 


6. Language Frameworks versus Changeable Parts 

Let us distinguish two parts of a programming lan- 
guage. First, its framework which gives the overall rules 
of the system, and second, its changeable parts, whose
existence is anticipated by the framework but whose 
particular behavior is not specified by it. For example, 
the for statement, and almost all other statements, are 
part of Algol’s framework but library functions and user- 
defined procedures are changeable parts. Thus the 
framework of a language describes its fixed features and
provides a general environment for its changeable fea- 
tures. 

Now suppose a language had a small framework 
which could accommodate a great variety of powerful 
features entirely as changeable parts. Then such a frame- 
work could support many different features and styles 
without being changed itself. In contrast to this pleasant 
possibility, von Neumann languages always seem to have 
an immense framework and very limited changeable 
parts. What causes this to happen? The answer concerns 
two problems of von Neumann languages. 

The first problem results from the von Neumann 
style of word-at-a-time programming, which requires 
that words flow back and forth to the state, just like the 
flow through the von Neumann bottleneck. Thus a von 
Neumann language must have a semantics closely cou- 
pled to the state, in which every detail of a computation 
changes the state. The consequence of this semantics 
closely coupled to states is that every detail of every 
feature must be built into the state and its transition 
rules. 

Thus every feature of a von Neumann language must 
be spelled out in stupefying detail in its framework. 
Furthermore, many complex features are needed to prop 
up the basically weak word-at-a-time style. The result is 
the inevitable rigid and enormous framework of a von 
Neumann language. 


7. Changeable Parts and Combining Forms 

The second problem of von Neumann languages is 
that their changeable parts have so little expressive 
power. Their gargantuan size is eloquent proof of this; 
after all, if the designer knew that all those complicated 
features, which he now builds into the framework, could 
be added later on as changeable parts, he would not be 
so eager to build them into the framework. 

Perhaps the most important element in providing 
powerful changeable parts in a language is the availabil- 
ity of combining forms that can be generally used to 
build new procedures from old ones. Von Neumann 
languages provide only primitive combining forms, and 
the von Neumann framework presents obstacles to their 
full use. 

One obstacle to the use of combining forms is the 
split between the expression world and the statement 
world in von Neumann languages. Functional forms 
naturally belong to the world of expressions; but no 
matter how powerful they are they can only build expres- 
sions that produce a one-word result. And it is in the 
statement world that these one-word results must be 
combined into the overall result. Combining single words 
is not what we really should be thinking about, but it is 
a large part of programming any task in von Neumann 
languages. To help assemble the overall result from 
single words these languages provide some primitive 
combining forms in the statement world— the for, while, 
and if-then-else statements — but the split between the 
two worlds prevents the combining forms in either world
from attaining the full power they can achieve in an 
undivided world. 

A second obstacle to the use of combining forms in 
von Neumann languages is their use of elaborate naming 
conventions, which are further complicated by the sub- 
stitution rules required in calling procedures. Each of 
these requires a complex mechanism to be built into the 
framework so that variables, subscripted variables, 
pointers, file names, procedure names, call-by-value for- 
mal parameters, call-by-name formal parameters, and so 
on, can all be properly interpreted. All these names, 
conventions, and rules interfere with the use of simple 
combining forms. 

8. APL versus Word-at-a-Time Programming 

Since I have said so much about word-at-a-time 
programming, I must now say something about APL 
[12]. We owe a great debt to Kenneth Iverson for showing 
us that there are programs that are neither word-at-a- 
time nor dependent on lambda expressions, and for 
introducing us to the use of new functional forms. And 
since APL assignment statements can store arrays, the 
effect of its functional forms is extended beyond a single 
assignment. 

Unfortunately, however, APL still splits program- 
ming into a world of expressions and a world of state- 
ments. Thus the effort to write one-line programs is 
partly motivated by the desire to stay in the more orderly 
world of expressions. APL has exactly three functional 
forms, called inner product, outer product, and reduc- 
tion. These are sometimes difficult to use, there are not 
enough of them, and their use is confined to the world 
of expressions. 
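
As a rough guide to what these three forms do, the sketch below models them in Python as functions that take functions and return functions. It is an approximation under stated assumptions: APL's forms work on arrays of any rank, whereas the hypothetical helpers `reduction`, `inner_product`, and `outer_product` here handle only nonempty flat lists (vectors), and the names are not APL's own notation.

```python
from functools import reduce
import operator


def reduction(f):
    """Model of APL reduction on a nonempty vector, folding from the right as APL does."""
    def reduced(v):
        return reduce(lambda acc, x: f(x, acc), reversed(v[:-1]), v[-1])
    return reduced


def inner_product(f, g):
    """Model of APL inner product on two vectors: pair elements with g, then reduce with f."""
    def product(u, v):
        # zip silently truncates to the shorter vector; APL would report a length error.
        return reduction(f)([g(x, y) for x, y in zip(u, v)])
    return product


def outer_product(g):
    """Model of APL outer product: apply g to every pair drawn from u and v."""
    def product(u, v):
        return [[g(x, y) for y in v] for x in u]
    return product


total = reduction(operator.add)([6, 10, 12])
dot = inner_product(operator.add, operator.mul)([1, 2, 3], [6, 5, 4])
table = outer_product(operator.mul)([1, 2, 3], [1, 2])
alternating = reduction(operator.sub)([1, 2, 3])   # 1 - (2 - 3)
```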

Finally, APL semantics is still too closely coupled to 
states. Consequently, despite the greater simplicity and 
power of the language, its framework has the complexity 
and rigidity characteristic of von Neumann languages. 

9. Von Neumann Languages Lack Useful 
Mathematical Properties 

So far we have discussed the gross size and inflexi- 
bility of von Neumann languages; another important 
defect is their lack of useful mathematical properties and 
the obstacles they present to reasoning about programs. 
Although a great amount of excellent work has been 
published on proving facts about programs, von Neu- 
mann languages have almost no properties that are 
helpful in this direction and have many properties that 
are obstacles (e.g., side effects, aliasing). 

Denotational semantics [23] and its foundations [20, 
21] provide an extremely helpful mathematical under- 
standing of the domain and function spaces implicit in 
programs. When applied to an applicative language 
(such as that of the “recursive programs” of [16]), its 
foundations provide powerful tools for describing the
language and for proving properties of programs. When 
applied to a von Neumann language, on the other hand, 
it provides a precise semantic description and is helpful 
in identifying trouble spots in the language. But the 
complexity of the language is mirrored in the complexity 
of the description, which is a bewildering collection of 
productions, domains, functions, and equations that is 
only slightly more helpful in proving facts about pro- 
grams than the reference manual of the language, since 
it is less ambiguous. 

Axiomatic semantics [11] precisely restates the in- 
elegant properties of von Neumann programs (i.e., trans- 
formations on states) as transformations on predicates. 
The word-at-a-time, repetitive game is not thereby 
changed, merely the playing field. The complexity of this 
axiomatic game of proving facts about von Neumann 
programs makes the successes of its practitioners all the 
more admirable. Their success rests on two factors in 
addition to their ingenuity: First, the game is restricted 
to small, weak subsets of full von Neumann languages 
that have states vastly simpler than real ones. Second, 
the new playing field (predicates and their transforma- 
tions) is richer, more orderly and effective than the old 
(states and their transformations). But restricting the 
game and transferring it to a more effective domain does 
not enable it to handle real programs (with the necessary 
complexities of procedure calls and aliasing), nor does it 
eliminate the clumsy properties of the basic von Neu- 
mann style. As axiomatic semantics is extended to cover 
more of a typical von Neumann language, it begins to 
lose its effectiveness with the increasing complexity that 
is required. 

Thus denotational and axiomatic semantics are de- 
scriptive formalisms whose foundations embody elegant 
and powerful concepts; but using them to describe a von 
Neumann language can not produce an elegant and 
powerful language any more than the use of elegant and 
modern machines to build an Edsel can produce an 
elegant and modern car. 

In any case, proofs about programs use the language 
of logic, not the language of programming. Proofs talk 
about programs but cannot involve them directly since 
the axioms of von Neumann languages are so unusable. 
In contrast, many ordinary proofs are derived by alge- 
braic methods. These methods require a language that 
has certain algebraic properties. Algebraic laws can then 
be used in a rather mechanical way to transform a 
problem into its solution. For example, to solve the 
equation 

ax + bx = a + b 

for x (given that a+b ≠ 0), we mechanically apply the 
distributive, identity, and cancellation laws, in succes- 
sion, to obtain 

(a + b)x = a + b 
(a + b)x = (a + b)1 
x = 1. 

Thus we have proved that x = 1 without leaving the 
“language” of algebra. Von Neumann languages, with 
their grotesque syntax, offer few such possibilities for 
transforming programs. 

As we shall see later, programs can be expressed in 
a language that has an associated algebra. This algebra 
can be used to transform programs and to solve some 
equations whose “unknowns” are programs, in much the 
same way one solves equations in high school algebra. 
Algebraic transformations and proofs use the language 
of the programs themselves, rather than the language of 
logic, which talks about programs. 

10. What Are the Alternatives to von Neumann 
Languages? 

Before discussing alternatives to von Neumann lan- 
guages, let me remark that I regret the need for the above 
negative and not very precise discussion of these lan- 
guages. But the complacent acceptance most of us give 
to these enormous, weak languages has puzzled and 
disturbed me for a long time. I am disturbed because 
that acceptance has consumed a vast effort toward mak- 
ing von Neumann languages fatter that might have been 
better spent in looking for new structures. For this reason 
I have tried to analyze some of the basic defects of 
conventional languages and show that those defects can- 
not be resolved unless we discover a new kind of lan- 
guage framework. 

In seeking an alternative to conventional languages 
we must first recognize that a system cannot be history 
sensitive (permit execution of one program to affect the 
behavior of a subsequent one) unless the system has 
some kind of state (which the first program can change 
and the second can access). Thus a history-sensitive 
model of a computing system must have a state-transition 
semantics, at least in this weak sense. But this does not 
mean that every computation must depend heavily on a 
complex state, with many state changes required for each 
small part of the computation (as in von Neumann 
languages). 

To illustrate some alternatives to von Neumann lan- 
guages, I propose to sketch a class of history-sensitive 
computing systems, where each system: a) has a loosely 
coupled state-transition semantics in which a state tran- 
sition occurs only once in a major computation; b) has 
a simply structured state and simple transition rules; c) 
depends heavily on an underlying applicative system 
both to provide the basic programming language of the 
system and to describe its state transitions. 

These systems, which I call applicative state transition 
(or AST) systems, are described in Section 14. These 
simple systems avoid many of the complexities and 
weaknesses of von Neumann languages and provide for 
a powerful and extensive set of changeable parts. How- 
ever, they are sketched only as crude examples of a vast 
area of non-von Neumann systems with various attrac- 
tive properties. I have been studying this area for the 
past three or four years and have not yet found a 
satisfying solution to the many conflicting requirements 
that a good language must resolve. But I believe this 
search has indicated a useful approach to designing non- 
von Neumann languages. 

This approach involves four elements, which can be 
summarized as follows. 

a) A functional style of programming without varia- 
bles. A simple, informal functional programming (FP) 
system is described. It is based on the use of combining 
forms for building programs. Several programs are given 
to illustrate functional programming. 

b) An algebra of functional programs. An algebra is 
described whose variables denote FP functional pro- 
grams and whose “operations” are FP functional forms, 
the combining forms of FP programs. Some laws of the 
algebra are given. Theorems and examples are given that 
show how certain function expressions may be trans- 
formed into equivalent infinite expansions that explain 
the behavior of the function. The FP algebra is compared 
with algebras associated with the classical applicative 
systems of Church and Curry. 

c) A formal functional programming system. A formal 
(FFP) system is described that extends the capabilities 
of the above informal FP systems. An FFP system is 
thus a precisely defined system that provides the ability 
to use the functional programming style of FP systems 
and their algebra of programs. FFP systems can be used 
as the basis for applicative state transition systems. 

d) Applicative state transition systems. As discussed 
above. 

The rest of the paper describes these four ele- 
ments, gives some brief remarks on computer design, 
and ends with a summary of the paper. 

11. Functional Programming Systems (FP Systems) 

11.1 Introduction 

In this section we give an informal description of a 
class of simple applicative programming systems called 
functional programming (FP) systems, in which “pro- 
grams” are simply functions without variables. The de- 
scription is followed by some examples and by a discus- 
sion of various properties of FP systems. 

An FP system is founded on the use of a fixed set of 
combining forms called functional forms. These, plus 
simple definitions, are the only means of building new 
functions from existing ones; they use no variables or 
substitution rules, and they become the operations of an 
associated algebra of programs. All the functions of an 
FP system are of one type: they map objects into objects 
and always take a single argument. 

In contrast, a lambda-calculus based system is 
founded on the use of the lambda expression, with an 
associated set of substitution rules for variables, for 
building new functions. The lambda expression (with its 
substitution rules) is capable of defining all possible 
computable functions of all possible types and of any 
number of arguments. This freedom and power has its 
disadvantages as well as its obvious advantages. It is 
analogous to the power of unrestricted control statements 
in conventional languages: with unrestricted freedom 
comes chaos. If one constantly invents new combining 
forms to suit the occasion, as one can in the lambda 
calculus, one will not become familiar with the style or 
useful properties of the few combining forms that are 
adequate for all purposes. Just as structured program- 
ming eschews many control statements to obtain pro- 
grams with simpler structure, better properties, and uni- 
form methods for understanding their behavior, so func- 
tional programming eschews the lambda expression, sub- 
stitution, and multiple function types. It thereby achieves 
programs built with familiar functional forms with 
known useful properties. These programs are so struc- 
tured that their behavior can often be understood and 
proven by mechanical use of algebraic techniques similar 
to those used in solving high school algebra problems. 

Functional forms, unlike most programming con- 
structs, need not be chosen on an ad hoc basis. Since 
they are the operations of an associated algebra, one 
chooses only those functional forms that not only provide 
powerful programming constructs, but that also have 
attractive algebraic properties: one chooses them to max- 
imize the strength and utility of the algebraic laws that 
relate them to other functional forms of the system. 

In the following description we shall be imprecise in 
not distinguishing between (a) a function symbol or 
expression and (b) the function it denotes. We shall 
indicate the symbols and expressions used to denote 
functions by example and usage. Section 13 describes a 
formal extension of FP systems (FFP systems); they can 
serve to clarify any ambiguities about FP systems. 

11.2 Description 

An FP system comprises the following: 

1) a set O of objects; 

2) a set F of functions f that map objects into objects; 

3) an operation, application; 

4) a set F of functional forms; these are used to combine 
existing functions, or objects, to form new functions in 
F; 

5) a set D of definitions that define some functions in F 
and assign a name to each. 

What follows is an informal description of each of 
the above entities with examples. 

11.2.1 Objects, O. An object x is either an atom, a 
sequence <x_1, ... , x_n> whose elements x_i are objects, or 
⊥ (“bottom” or “undefined”). Thus the choice of a set A 
of atoms determines the set of objects. We shall take A 
to be the set of nonnull strings of capital letters, digits, 
and special symbols not used by the notation of the FP 
system. Some of these strings belong to the class of atoms 
called “numbers.” The atom φ is used to denote the 
empty sequence and is the only object which is both an 
atom and a sequence. The atoms T and F are used to 
denote “true” and “false.” 

There is one important constraint in the construction 
of objects: if x is a sequence with ⊥ as an element, then 
x = ⊥. That is, the “sequence constructor” is “⊥-pre- 
serving.” Thus no proper sequence has ⊥ as an element. 

Examples of objects 

⊥    1.5    φ    AB3    <AB, 1, 2.3> 

<A, <<B>, C>, D>    <A, ⊥> = ⊥ 
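
As a rough illustration of the object domain just described, the following Python sketch models ⊥ as a sentinel value, φ as the empty tuple, and sequences as tuples. The names BOTTOM, PHI, seq, and is_atom are assumptions made for this illustration; they are not part of the FP notation.

```python
class _Bottom:
    """Sentinel standing in for the undefined object (bottom)."""
    def __repr__(self):
        return "BOTTOM"

BOTTOM = _Bottom()
PHI = ()  # the empty sequence; it is also treated as an atom

def seq(*elements):
    """Bottom-preserving sequence constructor (hypothetical helper)."""
    if any(e is BOTTOM for e in elements):
        return BOTTOM
    return tuple(elements)

def is_atom(x):
    """Atoms are modeled here as Python strings and numbers, plus PHI."""
    return x == PHI or isinstance(x, (str, int, float))

# The examples of objects from the text, built with the sketch above.
print(seq("AB", 1, 2.3))
print(seq("A", seq(seq("B"), "C"), "D"))
print(seq("A", BOTTOM))   # collapses to BOTTOM, as the constraint requires
```

Representing T and F as the strings "T" and "F" (rather than Python booleans) would keep them distinct from the numbers 1 and 0; that choice is also an assumption of this sketch.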

11.2.2 Application. An FP system has a single oper- 
ation, application. If f is a function and x is an object, 
then f:x is an application and denotes the object which 
is the result of applying f to x. f is the operator of the 
application and x is the operand. 

Examples of applications 

+:<1,2> = 3    tl:<A,B,C> = <B,C> 

1:<A,B,C> = A    2:<A,B,C> = B 

11.2.3 Functions, F. All functions f in F map objects 
into objects and are bottom-preserving: f:⊥ = ⊥, for all f 
in F. Every function in F is either primitive, that is, 
supplied with the system, or it is defined (see below), or 
it is a functional form (see below). 

It is sometimes useful to distinguish between two 
cases in which f:x=⊥. If the computation for f:x termi- 
nates and yields the object ⊥, we say f is undefined at x, 
that is, f terminates but has no meaningful value at x. 
Otherwise we say f is nonterminating at x. 

Examples of primitive functions 

Our intention is to provide FP systems with widely 
useful and powerful primitive functions rather than weak 
ones that could then be used to define useful ones. The 
following examples define some typical primitive func- 
tions, many of which are used in later examples of 
programs. In the following definitions we use a variant 
of McCarthy’s conditional expressions [17]; thus we write 

p_1 → e_1; ... ; p_n → e_n; e_{n+1} 

instead of McCarthy’s expression 

(p_1 → e_1, ... , p_n → e_n, T → e_{n+1}). 

The following definitions are to hold for all objects x, 
x_i, y, y_i, z, z_i: 

Selector functions 

1:x ≡ x=<x_1, ... , x_n> → x_1; ⊥ 

and for any positive integer s 

s:x ≡ x=<x_1, ... , x_n> & n≥s → x_s; ⊥ 

Thus, for example, 3:<A,B,C> = C and 2:<A> = ⊥. 
Note that the function symbols 1, 2, etc. are distinct from 
the atoms 1, 2, etc. 

Tail 

tl:x ≡ x=<x_1> → φ; 

x=<x_1, ... , x_n> & n≥2 → <x_2, ... , x_n>; ⊥ 

Identity 

id:x ≡ x 

Atom 

atom:x ≡ x is an atom → T; x≠⊥ → F; ⊥ 

Equals 

eq:x ≡ x=<y,z> & y=z → T; x=<y,z> & y≠z → F; ⊥ 

Null 

null:x ≡ x=φ → T; x≠⊥ → F; ⊥ 

Reverse 

reverse:x ≡ x=φ → φ; 

x=<x_1, ... , x_n> → <x_n, ... , x_1>; ⊥ 

Distribute from left; distribute from right 

distl:x ≡ x=<y,φ> → φ; 

x=<y,<z_1, ... , z_n>> → <<y,z_1>, ... , <y,z_n>>; ⊥ 

distr:x ≡ x=<φ,y> → φ; 

x=<<y_1, ... , y_n>,z> → <<y_1,z>, ... , <y_n,z>>; ⊥ 

Length 

length:x ≡ x=<x_1, ... , x_n> → n; x=φ → 0; ⊥ 

Add, subtract, multiply, and divide 

+:x ≡ x=<y,z> & y,z are numbers → y+z; ⊥ 

−:x ≡ x=<y,z> & y,z are numbers → y−z; ⊥ 

×:x ≡ x=<y,z> & y,z are numbers → y×z; ⊥ 

÷:x ≡ x=<y,z> & y,z are numbers → y÷z; ⊥ 

(where y÷0 = ⊥) 

Transpose 

trans:x ≡ x=<φ, ... , φ> → φ; 

x=<x_1, ... , x_n> → <y_1, ... , y_m>; ⊥ 

where 

x_i=<x_{i1}, ... , x_{im}> and 

y_j=<x_{1j}, ... , x_{nj}>, 1≤i≤n, 1≤j≤m. 

And, or, not 

and:x ≡ x=<T,T> → T; 

x=<T,F> ∨ x=<F,T> ∨ x=<F,F> → F; ⊥ 
etc. 

Append left; append right 

apndl:x ≡ x=<y,φ> → <y>; 

x=<y,<z_1, ... , z_n>> → <y,z_1, ... , z_n>; ⊥ 

apndr:x ≡ x=<φ,z> → <z>; 

x=<<y_1, ... , y_n>,z> → <y_1, ... , y_n,z>; ⊥ 

Right selectors; Right tail 

1r:x ≡ x=<x_1, ... , x_n> → x_n; ⊥ 

2r:x ≡ x=<x_1, ... , x_n> & n≥2 → x_{n-1}; ⊥ 

etc. 

tlr:x ≡ x=<x_1> → φ; 

x=<x_1, ... , x_n> & n≥2 → <x_1, ... , x_{n-1}>; ⊥ 

Rotate left; rotate right 

rotl:x ≡ x=φ → φ; x=<x_1> → <x_1>; 

x=<x_1, ... , x_n> & n≥2 → <x_2, ... , x_n,x_1>; ⊥ 
etc. 
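
To make a few of these primitive definitions concrete, here is a minimal Python sketch. It assumes objects are modeled as Python values: tuples for sequences, the empty tuple for φ, the strings "T" and "F" for the truth atoms, and a sentinel BOTTOM for ⊥. All Python names are illustrative assumptions, and only some of the primitives are covered.

```python
class _Bottom:
    def __repr__(self):
        return "BOTTOM"

BOTTOM = _Bottom()  # stands in for the undefined object
PHI = ()            # the empty sequence

def is_seq(x):
    return isinstance(x, tuple)

def is_num(x):
    return isinstance(x, (int, float))

def sel(s):
    """Selector s (1-based): the s-th element of a long-enough sequence."""
    return lambda x: x[s - 1] if is_seq(x) and len(x) >= s else BOTTOM

def tl(x):
    """Tail: defined only for nonempty sequences."""
    return x[1:] if is_seq(x) and len(x) >= 1 else BOTTOM

def null(x):
    if x is BOTTOM:
        return BOTTOM
    return "T" if x == PHI else "F"

def reverse(x):
    return x[::-1] if is_seq(x) else BOTTOM

def length(x):
    return len(x) if is_seq(x) else BOTTOM

def add(x):
    """+ : defined only on pairs of numbers."""
    if is_seq(x) and len(x) == 2 and all(is_num(v) for v in x):
        return x[0] + x[1]
    return BOTTOM

def distl(x):
    """<y, <z1..zn>> becomes <<y,z1>, ..., <y,zn>>."""
    if is_seq(x) and len(x) == 2 and is_seq(x[1]):
        return tuple((x[0], z) for z in x[1])
    return BOTTOM

def distr(x):
    """<<y1..yn>, z> becomes <<y1,z>, ..., <yn,z>>."""
    if is_seq(x) and len(x) == 2 and is_seq(x[0]):
        return tuple((y, x[1]) for y in x[0])
    return BOTTOM

def trans(x):
    """Transpose a nonempty sequence of equal-length sequences."""
    if not is_seq(x) or not x or not all(is_seq(r) for r in x):
        return BOTTOM
    if len({len(r) for r in x}) != 1:
        return BOTTOM
    return tuple(zip(*x))

def rotl(x):
    return x[1:] + x[:1] if is_seq(x) else BOTTOM

print(tl(("A", "B", "C")), sel(1)(("A", "B", "C")), sel(2)(("A",)))
```

Every function here returns BOTTOM when its argument does not match any clause of its definition, which also makes each one bottom-preserving. Treating trans of the empty sequence as BOTTOM is a choice made in this sketch.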

11.2.4 Functional forms, F. A functional form is an 
expression denoting a function; that function depends on 
the functions or objects which are the parameters of the 
expression. Thus, for example, if f and g are any func- 
tions, then f∘g is a functional form, the composition of f 
and g, f and g are its parameters, and it denotes the 
function such that, for any object x, 

(f∘g):x = f:(g:x). 

Some functional forms may have objects as parameters. 
For example, for any object x, x̄ is a functional form, the 
constant function of x, so that for any object y 

x̄:y ≡ y=⊥ → ⊥; x. 

In particular, ⊥̄ is the everywhere-⊥ function. 

Below we give some functional forms, many of which 
are used later in this paper. We use p, f, and g with and 
without subscripts to denote arbitrary functions; and x, 
x_1, ... , x_n, y as arbitrary objects. Square brackets [...] are 
used to indicate the functional form for construction, 
which denotes a function, whereas pointed brackets 
<...> denote sequences, which are objects. Parentheses 
are used both in particular functional forms (e.g., in 
condition) and generally to indicate grouping. 

Composition 

(f∘g):x ≡ f:(g:x) 

Construction 

[f_1, ... , f_n]:x ≡ <f_1:x, ... , f_n:x> (Recall that since 
<... , ⊥, ...> = ⊥ and all functions are ⊥-preserving, so 
is [f_1, ... , f_n].) 

Condition 

(p → f; g):x ≡ (p:x)=T → f:x; (p:x)=F → g:x; ⊥ 

Conditional expressions (used outside of FP systems to 
describe their functions) and the functional form condi- 
tion are both identified by “→”. They are quite different 
although closely related, as shown in the above defini- 
tions. But no confusion should arise, since the elements 
of a conditional expression all denote values, whereas 
the elements of the functional form condition all denote 
functions, never values. When no ambiguity arises we 
omit right-associated parentheses; we write, for example, 
p_1 → f_1; p_2 → f_2; g for (p_1 → f_1; (p_2 → f_2; g)). 

Constant (Here x is an object parameter.) 

x̄:y ≡ y=⊥ → ⊥; x 

Insert 

/f:x ≡ x=<x_1> → x_1; x=<x_1, ... , x_n> & n≥2 

→ f:<x_1, /f:<x_2, ... , x_n>>; ⊥ 

If f has a unique right unit u_f ≠ ⊥, where 
f:<x,u_f> ∈ {x, ⊥} for all objects x, then the above 
definition is extended: /f:φ = u_f. Thus 

/+:<4,5,6> = +:<4, +:<5, /+:<6>>> 

= +:<4, +:<5,6>> = 15 

/+:φ = 0 

Apply to all 

αf:x ≡ x=φ → φ; 

x=<x_1, ... , x_n> → <f:x_1, ... , f:x_n>; ⊥ 

Binary to unary (x is an object parameter) 

(bu f x):y ≡ f:<x,y> 

Thus 

(bu + 1):x = 1+x 

While 

(while p f):x ≡ p:x=T → (while p f):(f:x); 

p:x=F → x; ⊥ 

The above functional forms provide an effective 
method for computing the values of the functions they 
denote (if they terminate) provided one can effectively 
apply their function parameters. 
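
The functional forms above can be sketched as ordinary higher-order functions. The Python below is one possible rendering, assuming tuples for sequences, "T" and "F" for truth atoms, and a BOTTOM sentinel for ⊥; the names compose, construct, condition, const, insert, alpha, bu, and while_ are assumptions chosen for this illustration.

```python
class _Bottom:
    def __repr__(self):
        return "BOTTOM"

BOTTOM = _Bottom()  # stands in for the undefined object

def is_seq(x):
    return isinstance(x, tuple)

def seq(*elements):
    """Bottom-preserving sequence constructor."""
    return BOTTOM if any(e is BOTTOM for e in elements) else tuple(elements)

def compose(*fs):
    """Composition: the rightmost function is applied first."""
    def run(x):
        for f in reversed(fs):
            x = f(x)
        return x
    return run

def construct(*fs):
    """Construction [f1, ..., fn]."""
    return lambda x: seq(*(f(x) for f in fs))

def condition(p, f, g):
    """Condition (p -> f; g)."""
    def run(x):
        t = p(x)
        if t == "T":
            return f(x)
        if t == "F":
            return g(x)
        return BOTTOM
    return run

def const(c):
    """Constant function of the object c."""
    return lambda y: BOTTOM if y is BOTTOM else c

def insert(f, unit=BOTTOM):
    """Insert /f; `unit` plays the role of the right unit when one exists."""
    def run(x):
        if not is_seq(x):
            return BOTTOM
        if len(x) == 0:
            return unit
        if len(x) == 1:
            return x[0]
        return f(seq(x[0], run(x[1:])))
    return run

def alpha(f):
    """Apply to all."""
    return lambda x: seq(*(f(e) for e in x)) if is_seq(x) else BOTTOM

def bu(f, c):
    """Binary to unary: fix the first argument of f to the object c."""
    return lambda y: f(seq(c, y))

def while_(p, f):
    def run(x):
        while True:
            t = p(x)
            if t == "F":
                return x
            if t != "T":
                return BOTTOM
            x = f(x)
    return run

def add(x):
    ok = is_seq(x) and len(x) == 2 and all(isinstance(v, (int, float)) for v in x)
    return x[0] + x[1] if ok else BOTTOM

print(insert(add, unit=0)((4, 5, 6)))   # the /+ example from the text
print(alpha(bu(add, 1))((1, 2, 3)))
```

Passing the right unit explicitly to insert is a simplification of this sketch; the text instead attaches the unit to the function f itself.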

11.2.5 Definitions. A definition in an FP system is an 
expression of the form 

Def l ≡ r 

where the left side l is an unused function symbol and 
the right side r is a functional form (which may depend 
on l). It expresses the fact that the symbol l is to denote 
the function given by r. Thus the definition Def last1 ≡ 
1∘reverse defines the function last1 that produces the 
last element of a sequence (or ⊥). Similarly, 

Def last ≡ null∘tl → 1; last∘tl 

defines the function last, which is the same as last1. Here 
in detail is how the definition would be used to compute 
last:<1,2>: 

last:<1,2> = 

definition of last ⇒ (null∘tl → 1; last∘tl):<1,2> 

action of the form (p → f; g) ⇒ last∘tl:<1,2> 

since null∘tl:<1,2> = null:<2> 
= F 

action of the form f∘g ⇒ last:(tl:<1,2>) 

definition of primitive tail ⇒ last:<2> 

definition of last ⇒ (null∘tl → 1; last∘tl):<2> 

action of the form (p → f; g) ⇒ 1:<2> 

since null∘tl:<2> = null:φ = T 
definition of selector 1 ⇒ 2 

The above illustrates the simple rule: to apply a 
defined symbol, replace it by the right side of its defini- 
tion. Of course, some definitions may define nontermi- 
nating functions. A set D of definitions is well formed if 
no two left sides are the same. 
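
The two definitions of "last element" can be tried out in a small Python sketch. It assumes the same modeling as before (tuples for sequences, "T"/"F" for truth atoms, a BOTTOM sentinel for ⊥), and every Python name is an illustrative assumption. The recursive definition is written as a Python function whose body is rebuilt on each call, which loosely mirrors the rule of replacing a defined symbol by the right side of its definition.

```python
class _Bottom:
    def __repr__(self):
        return "BOTTOM"

BOTTOM = _Bottom()  # stands in for the undefined object
PHI = ()            # the empty sequence

def is_seq(x):
    return isinstance(x, tuple)

def sel(s):
    return lambda x: x[s - 1] if is_seq(x) and len(x) >= s else BOTTOM

def tl(x):
    return x[1:] if is_seq(x) and len(x) >= 1 else BOTTOM

def null(x):
    if x is BOTTOM:
        return BOTTOM
    return "T" if x == PHI else "F"

def reverse(x):
    return x[::-1] if is_seq(x) else BOTTOM

def compose(f, g):
    return lambda x: f(g(x))

def condition(p, f, g):
    def run(x):
        t = p(x)
        if t == "T":
            return f(x)
        if t == "F":
            return g(x)
        return BOTTOM
    return run

# Def last1 = 1 o reverse
last1 = compose(sel(1), reverse)

# Def last = null o tl -> 1; last o tl
def last(x):
    body = condition(compose(null, tl), sel(1), compose(last, tl))
    return body(x)

for x in [(1, 2), ("A", "B", "C"), PHI, "A"]:
    print(x, last(x), last1(x))
```

On these sample arguments the two functions agree, including on φ and on a non-sequence, where both yield BOTTOM. This is a spot check on a few inputs, not a proof that the functions are equal.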

11.2.6 Semantics. It can be seen from the above that 
an FP system is determined by choice of the following 
sets: (a) The set of atoms A (which determines the set of 
objects), (b) The set of primitive functions P. (c) The set 
of functional forms F. (d) A well formed set of definitions 
D. To understand the semantics of such a system one 
needs to know how to compute f:x for any function f 
and any object x of the system. There are exactly four 
possibilities for f: 

(1) f is a primitive function; 

(2) f is a functional form; 

(3) there is one definition in D, Def f ≡ r; and 

(4) none of the above. 

If f is a primitive function, then one has its description 
and knows how to apply it. If f is a functional form, then 
the description of the form tells how to compute f:x in 
terms of the parameters of the form, which can be done 
by further use of these rules. If f is defined, Def f ≡ r, as 
in (3), then to find f:x one computes r:x, which can be 
done by further use of these rules. If none of these, then 
f:x ≡ ⊥. Of course, the use of these rules may not 
terminate for some f and some x, in which case we assign 
the value f:x ≡ ⊥. 

11.3 Examples of Functional Programs 

The following examples illustrate the functional pro- 
gramming style. Since this style is unfamiliar to most 
readers, it may cause confusion at first; the important 
point to remember is that no part of a function definition 
is a result itself. Instead, each part is a function that must 
be applied to an argument to obtain a result. 

11.3.1 Factorial. 

Def ! ≡ eq0 → 1̄; ×∘[id, !∘sub1] 

where 

Def eq0 ≡ eq∘[id, 0̄] 

Def sub1 ≡ −∘[id, 1̄] 

Here are some of the intermediate expressions an FP 
system would obtain in evaluating !:2: 

!:2 ⇒ (eq0 → 1̄; ×∘[id, !∘sub1]):2 

⇒ ×∘[id, !∘sub1]:2 
⇒ ×:<id:2, !∘sub1:2> ⇒ ×:<2, !:1> 

⇒ ×:<2, ×:<1, !:0>> 
⇒ ×:<2, ×:<1, 1̄:0>> ⇒ ×:<2, ×:<1,1>> 

⇒ ×:<2,1> ⇒ 2. 

In Section 12 we shall see how theorems of the algebra 
of FP programs can be used to prove that ! is the 
factorial function. 
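
The factorial definition can be transcribed almost symbol for symbol into Python. This is a sketch under the same modeling assumptions as before (tuples for sequences, "T"/"F" for truth atoms, a BOTTOM sentinel for ⊥); the name fact stands in for the symbol !, and all helper names are assumptions of the sketch.

```python
class _Bottom:
    def __repr__(self):
        return "BOTTOM"

BOTTOM = _Bottom()  # stands in for the undefined object

def is_seq(x):
    return isinstance(x, tuple)

def seq(*elements):
    return BOTTOM if any(e is BOTTOM for e in elements) else tuple(elements)

def compose(f, g):
    return lambda x: f(g(x))

def construct(*fs):
    return lambda x: seq(*(f(x) for f in fs))

def condition(p, f, g):
    def run(x):
        t = p(x)
        if t == "T":
            return f(x)
        if t == "F":
            return g(x)
        return BOTTOM
    return run

def const(c):
    return lambda y: BOTTOM if y is BOTTOM else c

def ident(x):
    return x

def eq(x):
    if is_seq(x) and len(x) == 2:
        return "T" if x[0] == x[1] else "F"
    return BOTTOM

def arith(op):
    """Lift a binary Python operator to an FP-style function on number pairs."""
    def run(x):
        if is_seq(x) and len(x) == 2 and all(isinstance(v, (int, float)) for v in x):
            return op(x[0], x[1])
        return BOTTOM
    return run

sub = arith(lambda y, z: y - z)
mul = arith(lambda y, z: y * z)

# Def eq0 = eq o [id, const 0]      Def sub1 = - o [id, const 1]
eq0 = compose(eq, construct(ident, const(0)))
sub1 = compose(sub, construct(ident, const(1)))

# Def ! = eq0 -> const 1; x o [id, ! o sub1]
def fact(x):
    body = condition(eq0, const(1),
                     compose(mul, construct(ident, compose(fact, sub1))))
    return body(x)

print(fact(2), fact(5), fact("A"))
```

For an argument such as -1 the definition never reaches the eq0 case, so the function is nonterminating there; in this Python sketch that shows up as a RecursionError rather than as a value.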

11.3.2 Inner product. We have seen earlier how this 
definition works. 

Def IP ≡ (/+)∘(α×)∘trans 
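
As a hedged illustration, the inner product definition can be written in Python as a composition of three steps: transpose, multiply each pair, then sum. The sketch assumes tuples for sequences and a BOTTOM sentinel for ⊥; all Python names are illustrative, and the right unit 0 for addition is passed explicitly to insert.

```python
class _Bottom:
    def __repr__(self):
        return "BOTTOM"

BOTTOM = _Bottom()  # stands in for the undefined object

def is_seq(x):
    return isinstance(x, tuple)

def seq(*elements):
    return BOTTOM if any(e is BOTTOM for e in elements) else tuple(elements)

def arith(op):
    """Lift a binary Python operator to an FP-style function on number pairs."""
    def run(x):
        if is_seq(x) and len(x) == 2 and all(isinstance(v, (int, float)) for v in x):
            return op(x[0], x[1])
        return BOTTOM
    return run

add = arith(lambda y, z: y + z)
mul = arith(lambda y, z: y * z)

def trans(x):
    """Transpose a nonempty sequence of equal-length sequences."""
    if not is_seq(x) or not x or not all(is_seq(r) for r in x):
        return BOTTOM
    if len({len(r) for r in x}) != 1:
        return BOTTOM
    return tuple(zip(*x))

def alpha(f):
    """Apply to all."""
    return lambda x: seq(*(f(e) for e in x)) if is_seq(x) else BOTTOM

def insert(f, unit=BOTTOM):
    """Insert /f, with an optional right unit for the empty sequence."""
    def run(x):
        if not is_seq(x):
            return BOTTOM
        if len(x) == 0:
            return unit
        if len(x) == 1:
            return x[0]
        return f(seq(x[0], run(x[1:])))
    return run

def compose(*fs):
    def run(x):
        for f in reversed(fs):
            x = f(x)
        return x
    return run

# Def IP = (/+) o (alpha x) o trans
IP = compose(insert(add, unit=0), alpha(mul), trans)

print(IP(((1, 2, 3), (6, 5, 4))))
print(IP(((1, 2), (3,))))   # vectors of unequal length
```

The second call yields BOTTOM because trans is undefined on rows of unequal length, and every later step preserves BOTTOM.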

11.3.3 Matrix multiply. This matrix multiplication 
program yields the product of any pair <m,n> of con- 
formable matrices, where each matrix m is represented 
as the sequence of its rows: 

m = <m_1, ... , m_r> 

where m_i = <m_{i1}, ... , m_{is}> for i = 1, ... , r. 

Def MM ≡ (ααIP)∘(αdistl)∘distr∘[1, trans∘2] 

The program MM has four steps, reading from right to 
left; each is applied in turn, beginning with [1, trans∘2], 
to the result of its predecessor. If the argument is <m,n>, 
then the first step yields <m,n'> where n' = trans:n. The 
second step yields <<m_1,n'>, ... , <m_r,n'>>, where the 
m_i are the rows of m. The third step, αdistl, yields 

<distl:<m_1,n'>, ... , distl:<m_r,n'>> = <p_1, ... , p_r> 

where 

p_i = distl:<m_i,n'> = <<m_i,n_1'>, ... , <m_i,n_s'>> 

for i = 1, ... , r 

and n_j' is the jth column of n (the jth row of n'). Thus p_i, 
a sequence of row and column pairs, corresponds to the 
i-th product row. The operator ααIP, or α(αIP), causes 
αIP to be applied to each p_i, which in turn causes IP to 
be applied to each row and column pair in each p_i. The 
result of the last step is therefore the sequence of rows 
comprising the product matrix. If either matrix is not 
rectangular, or if the length of a row of m differs from 
that of a column of n, or if any element of m or n is not 
a number, the result is ⊥. 
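
The four steps of MM can be followed in a Python sketch built from the same kind of helpers. It assumes matrices are tuples of row tuples and that ⊥ is a BOTTOM sentinel; every Python name below is an assumption made for illustration.

```python
class _Bottom:
    def __repr__(self):
        return "BOTTOM"

BOTTOM = _Bottom()  # stands in for the undefined object

def is_seq(x):
    return isinstance(x, tuple)

def seq(*elements):
    return BOTTOM if any(e is BOTTOM for e in elements) else tuple(elements)

def arith(op):
    def run(x):
        if is_seq(x) and len(x) == 2 and all(isinstance(v, (int, float)) for v in x):
            return op(x[0], x[1])
        return BOTTOM
    return run

add = arith(lambda y, z: y + z)
mul = arith(lambda y, z: y * z)

def sel(s):
    return lambda x: x[s - 1] if is_seq(x) and len(x) >= s else BOTTOM

def trans(x):
    if not is_seq(x) or not x or not all(is_seq(r) for r in x):
        return BOTTOM
    if len({len(r) for r in x}) != 1:
        return BOTTOM
    return tuple(zip(*x))

def distl(x):
    if is_seq(x) and len(x) == 2 and is_seq(x[1]):
        return tuple((x[0], z) for z in x[1])
    return BOTTOM

def distr(x):
    if is_seq(x) and len(x) == 2 and is_seq(x[0]):
        return tuple((y, x[1]) for y in x[0])
    return BOTTOM

def compose(*fs):
    def run(x):
        for f in reversed(fs):
            x = f(x)
        return x
    return run

def construct(*fs):
    return lambda x: seq(*(f(x) for f in fs))

def alpha(f):
    return lambda x: seq(*(f(e) for e in x)) if is_seq(x) else BOTTOM

def insert(f, unit=BOTTOM):
    def run(x):
        if not is_seq(x):
            return BOTTOM
        if len(x) == 0:
            return unit
        if len(x) == 1:
            return x[0]
        return f(seq(x[0], run(x[1:])))
    return run

IP = compose(insert(add, unit=0), alpha(mul), trans)

# Def MM = (alpha alpha IP) o (alpha distl) o distr o [1, trans o 2]
MM = compose(alpha(alpha(IP)), alpha(distl), distr,
             construct(sel(1), compose(trans, sel(2))))

m = ((1, 2), (3, 4))
n = ((5, 6), (7, 8))
print(MM((m, n)))
print(MM((((1, 2),), ((1, 2, 3),))))   # not conformable
```

For the non-conformable pair the row and column lengths differ, so IP yields BOTTOM on each pair and the whole result collapses to BOTTOM, in line with the last sentence above.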

This program MM does not name its arguments or 
any intermediate results; contains no variables, no loops, 
no control statements nor procedure declarations; has no 
initialization instructions; is not word-at-a-time in na- 
ture; is hierarchically constructed from simpler compo- 
nents; uses generally applicable housekeeping forms and 
operators (e.g., αf, distl, distr, trans); is perfectly general; 
yields ⊥ whenever its argument is inappropriate in any 
way; does not constrain the order of evaluation unnec- 
essarily (all applications of IP to row and column pairs 
can be done in parallel or in any order); and, using 
algebraic laws (see below), can be transformed into more 
“efficient” or into more “explanatory” programs (e.g., 
one that is recursively defined). None of these properties 
hold for the typical von Neumann matrix multiplication 
program. 

Although it has an unfamiliar and hence puzzling 
form, the program MM describes the essential operations 
of matrix multiplication without overdetermining the 
process or obscuring parts of it, as most programs do; 
hence many straightforward programs for the operation 
can be obtained from it by formal transformations. It is 
an inherently inefficient program for von Neumann 
computers (with regard to the use of space), but efficient 
ones can be derived from it and realizations of FP 
systems can be imagined that could execute MM without 
the prodigal use of space it implies. Efficiency questions 
are beyond the scope of this paper; let me suggest only 
that since the language is so simple and does not dictate 
any binding of lambda-type variables to data, there may 
be better opportunities for the system to do some kind of 
“lazy” evaluation [9, 10] and to control data management 
more efficiently than is possible in lambda-calculus 
based systems. 

11.4 Remarks About FP Systems 

11.4.1 FP systems as programming languages. FP 
systems are so minimal that some readers may find it 
difficult to view them as programming languages. 
Viewed as such, a function f is a program, an object x is 
the contents of the store, and f:x is the contents of the 
store after program f is activated with x in the store. The 
set of definitions is the program library. The primitive 
functions and the functional forms provided by the 
system are the basic statements of a particular program- 
ming language. Thus, depending on the choice of prim- 
itive functions and functional forms, the FP framework 
provides for a large class of languages with various styles 
and capabilities. The algebra of programs associated 
with each of these depends on its particular set of func- 
tional forms. The primitive functions, functional forms, 
and programs given in this paper comprise an effort to 
develop just one of these possible styles. 

11.4.2 Limitations of FP systems. FP systems have 
a number of limitations. For example, a given FP system 
is a fixed language; it is not history sensitive: no program 
can alter the library of programs. It can treat input and 
output only in the sense that x is an input and f:x is the 
output. If the set of primitive functions and functional 
forms is weak, it may not be able to express every 
computable function. 

An FP system cannot compute a program since func- 
tion expressions are not objects. Nor can one define new 
functional forms within an FP system. (Both of these 
limitations are removed in formal functional program- 
ming (FFP) systems in which objects “represent” func- 
tions.) Thus no FP system can have a function, apply, 
such that 

apply:<x,y> ≡ x:y 

because, on the left, x is an object, and, on the right, x 
is a function. (Note that we have been careful to keep 
the set of function symbols and the set of objects distinct: 
thus the selector 1 is a function symbol, and the atom 1 is an object.) 

The primary limitation of FP systems is that they are 
not history sensitive. Therefore they must be extended 
somehow before they can become practically useful. For 
discussion of such extensions, see the sections on FFP 
and AST systems (Sections 13 and 14). 

11.4.3 Expressive power of FP systems. Suppose two 
FP systems, FP_1 and FP_2, both have the same set of 
objects and the same set of primitive functions, but the 
set of functional forms of FP_1 properly includes that of 
FP_2. Suppose also that both systems can express all 
computable functions on objects. Nevertheless, we can 
say that FP_1 is more expressive than FP_2, since every 
function expression in FP_2 can be duplicated in FP_1, but 
by using a functional form not belonging to FP_2, FP_1 can 
express some functions more directly and easily than 
FP_2. 

I believe the above observation could be developed 
into a theory of the expressive power of languages in 
which a language A would be more expressive than 
language B under the following roughly stated condi- 
tions. First, form all possible functions of all types in A 
by applying all existing functions to objects and to each 
other in all possible ways until no new function of any 
type can be formed. (The set of objects is a type; the set 
of continuous functions [T→U] from type T to type U is 
a type. If f∈[T→U] and t∈T, then ft in U can be formed 
by applying f to t.) Do the same in language B. Next, 
compare each type in A to the corresponding type in B. 
If, for every type, A’s type includes B’s corresponding 
type, then A is more expressive than B (or equally 
expressive). If some type of A’s functions is incomparable 
to B’s, then A and B are not comparable in expressive 
power. 

11.4.4 Advantages of FP systems. The main reason 
FP systems are considerably simpler than either conven- 
tional languages or lambda-calculus-based languages is 
that they use only the most elementary fixed naming 
system (naming a function in a definition) with a simple 
fixed rule of substituting a function for its name. Thus 
they avoid the complexities both of the naming systems 
of conventional languages and of the substitution rules 
of the lambda calculus. FP systems permit the definition 
of different naming systems (see Sections 13.3.4 and 
14.7) for various purposes. These need not be complex, 
since many programs can do without them completely. 
Most importantly, they treat names as functions that can 
be combined with other functions without special treat- 
ment. 

FP systems offer an escape from conventional word- 
at-a-time programming to a degree greater even than 
APL [12] (the most successful attack on the problem to 
date within the von Neumann framework) because they 
provide a more powerful set of functional forms within 
a unified world of expressions. They offer the opportu- 
nity to develop higher level techniques for thinking 
about, manipulating, and writing programs. 


12. The Algebra of Programs for FP Systems 
12.1 Introduction 

The algebra of the programs described below is the 
work of an amateur in algebra, and I want to show that 
it is a game amateurs can profitably play and enjoy, a 
game that does not require a deep understanding of logic 
and mathematics. In spite of its simplicity, it can help 
one to understand and prove things about programs in 
a systematic, rather mechanical way. 

So far, proving a program correct requires knowledge 
of some moderately heavy topics in mathematics and 
logic: properties of complete partially ordered sets, con- 
tinuous functions, least fixed points of functionals, the 
first-order predicate calculus, predicate transformers, 
weakest preconditions, to mention a few topics in a few 
approaches to proving programs correct. These topics 
have been very useful for professionals who make it their 
business to devise proof techniques; they have published 
a lot of beautiful work on this subject, starting with the 
work of McCarthy and Floyd, and, more recently, that 
of Burstall, Dijkstra, Manna and his associates, Milner, 
Morris, Reynolds, and many others. Much of this work 
is based on the foundations laid down by Dana Scott 
(denotational semantics) and C. A. R. Hoare (axiomatic 
semantics). But its theoretical level places it beyond the 
scope of most amateurs who work outside of this spe- 
cialized field. 

If the average programmer is to prove his programs 
correct, he will need much simpler techniques than those 
the professionals have so far put forward. The algebra of 
programs below may be one starting point for such a 
proof discipline and, coupled with current work on al- 
gebraic manipulation, it may also help provide a basis 
for automating some of that discipline. 

One advantage of this algebra over other proof tech- 
niques is that the programmer can use his programming 
language as the language for deriving proofs, rather than 
having to state proofs in a separate logical system that 
merely talks about his programs. 

At the heart of the algebra of programs are laws and 
theorems that state that one function expression is the 
same as another. Thus the law [f,g]∘h ≡ [f∘h, g∘h] says 
that the construction of f and g (composed with h) is the 
same function as the construction of (f composed with 
h) and (g composed with h) no matter what the functions 
f, g, and h are. Such laws are easy to understand, easy to 
justify, and easy and powerful to use. However, we also 
wish to use such laws to solve equations in which an 
“unknown” function appears on both sides of the equa- 
tion. The problem is that if f satisfies some such equation, 
it will often happen that some extension f' of f will also 
satisfy the same equation. Thus, to give a unique mean- 
ing to solutions of such equations, we shall require a 
foundation for the algebra of programs (which uses 
Scott’s notion of least fixed points of continuous func- 
tionals) to assure us that solutions obtained by algebraic 
manipulation are indeed least, and hence unique, solu- 
tions. 
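
The law relating construction and composition can be spot-checked on sample arguments. The Python sketch below is only a test on a handful of inputs for one arbitrary choice of the three functions, not a proof; it assumes tuples for sequences and a BOTTOM sentinel for ⊥, and all names are illustrative.

```python
class _Bottom:
    def __repr__(self):
        return "BOTTOM"

BOTTOM = _Bottom()  # stands in for the undefined object

def is_seq(x):
    return isinstance(x, tuple)

def seq(*elements):
    return BOTTOM if any(e is BOTTOM for e in elements) else tuple(elements)

def sel(s):
    return lambda x: x[s - 1] if is_seq(x) and len(x) >= s else BOTTOM

def tl(x):
    return x[1:] if is_seq(x) and len(x) >= 1 else BOTTOM

def reverse(x):
    return x[::-1] if is_seq(x) else BOTTOM

def compose(f, g):
    return lambda x: f(g(x))

def construct(*fs):
    return lambda x: seq(*(f(x) for f in fs))

# An arbitrary choice of f, g, and h for the check.
f, g, h = sel(1), tl, reverse

lhs = compose(construct(f, g), h)                  # [f,g] o h
rhs = construct(compose(f, h), compose(g, h))      # [f o h, g o h]

samples = [("A", "B", "C"), ("A",), (), "A", 7, BOTTOM]

def law_holds(x):
    """True when both sides of the law give the same object at x."""
    return lhs(x) == rhs(x)

for x in samples:
    print(x, lhs(x), rhs(x), law_holds(x))
```

Both sides agree on every sample, including those where the result is BOTTOM.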

Our goal is to develop a foundation for the algebra 
of programs that disposes of the theoretical issues, so 
that a programmer can use simple algebraic laws and 
one or two theorems from the foundations to solve 
problems and create proofs in the same mechanical style 
we use to solve high-school algebra problems, and so 
that he can do so without knowing anything about least 
fixed points or predicate transformers. 

One particular foundational problem arises: given 
equations of the form 

f ≡ p_0 → q_0; ... ; p_i → q_i; E_i(f),   (1) 

where the p’s and q’s are functions not involving f and 
E_i(f) is a function expression involving f, the laws of the 
algebra will often permit the formal “extension” of this 
equation by one more “clause” by deriving 

E_i(f) ≡ p_{i+1} → q_{i+1}; E_{i+1}(f)   (2) 

which, by replacing E_i(f) in (1) by the right side of (2), 
yields 

f ≡ p_0 → q_0; ... ; p_{i+1} → q_{i+1}; E_{i+1}(f).   (3) 

This formal extension may go on without limit. One 
question the foundations must then answer is: when can 
the least f satisfying (1) be represented by the infinite 
expansion 

f ≡ p_0 → q_0; ... ; p_n → q_n; ...   (4) 

in which the final clause involving f has been dropped, 
so that we now have a solution whose right side is free 
of f’s? Such solutions are helpful in two ways: first, they 
give proofs of “termination” in the sense that (4) means 
that f:x is defined if and only if there is an n such that, 
for every i less than n, p_i:x = F and p_n:x = T and q_n:x 
is defined. Second, (4) gives a case-by-case description 
of f that can often clarify its behavior. 

The foundations for the algebra given in a subsequent 
section are a modest start toward the goal stated above. 
For a limited class of equations its “linear expansion 
theorem” gives a useful answer as to when one can go 
from indefinitely extendable equations like (1) to infinite 
expansions like (4). For a larger class of equations, a 
more general “expansion theorem” gives a less helpful 
answer to similar questions. Hopefully, more powerful 
theorems covering additional classes of equations can be 
found. But for the present, one need only know the 
conclusions of these two simple foundational theorems 
in order to follow the theorems and examples appearing 
in this section. 

The results of the foundations subsection are sum- 
marized in a separate, earlier subsection titled “expan- 
sion theorems,” without reference to fixed point con- 
cepts. The foundations subsection itself is placed later 
where it can be skipped by readers who do not want to 
go into that subject. 

12.2 Some Laws of the Algebra of Programs 

In the algebra of programs for an FP system, variables
range over the set of functions of the system. The
“operations” of the algebra are the functional forms of the
system. Thus, for example, [f,g]°h is an expression of
the algebra for the FP system described above, in which
f, g, and h are variables denoting arbitrary functions of
that system. And

[f,g]°h ≡ [f°h, g°h]

is a law of the algebra which says that, whatever functions
one chooses for f, g, and h, the function on the left
is the same as that on the right. Thus this algebraic law
is merely a restatement of the following proposition
about any FP system that includes the functional forms
[f,g] and f°g:

Proposition: For all functions f, g, and h and all objects
x, ([f,g]°h):x = [f°h, g°h]:x.

Proof:

([f,g]°h):x = [f,g]:(h:x)
      by definition of composition
   = <f:(h:x), g:(h:x)>
      by definition of construction
   = <(f°h):x, (g°h):x>
      by definition of composition
   = [f°h, g°h]:x
      by definition of construction □
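The proposition can also be spot-checked mechanically. The following is a minimal Python sketch, not part of the FP system itself: it assumes that None stands in for the undefined object, that sequences are Python tuples, and that the helper names (compose, construct, strict) and the sample functions are purely illustrative.

```python
BOT = None  # assumption: Python's None stands in for the undefined object (bottom)

def compose(f, g):
    """f°g: apply g first, then f."""
    return lambda x: f(g(x))

def construct(*fs):
    """[f1, ..., fn]: the sequence of results, or bottom if any result is bottom."""
    def built(x):
        ys = tuple(f(x) for f in fs)
        return BOT if any(y is BOT for y in ys) else ys
    return built

def strict(fn):
    """Lift a Python function on integers to a bottom-preserving function."""
    return lambda x: fn(x) if isinstance(x, int) else BOT

# Hypothetical sample functions; h is deliberately partial.
f = strict(lambda n: n + 1)
g = strict(lambda n: n * n)
h = strict(lambda n: 10 // n if n != 0 else BOT)  # h:0 is undefined

lhs = compose(construct(f, g), h)               # [f,g]°h
rhs = construct(compose(f, h), compose(g, h))   # [f°h, g°h]

samples = [0, 1, 2, 5, BOT, "atom"]
law_holds = all(lhs(x) == rhs(x) for x in samples)
```

Agreement on a handful of samples is of course no proof; the four-line argument above is what establishes the law for all objects.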

Some laws have a domain smaller than the domain
of all objects. Thus 1°[f,g] ≡ f does not hold for objects
x such that g:x = ⊥. We write

defined°g →→ 1°[f,g] ≡ f

to indicate that the law (or theorem) on the right holds
within the domain of objects x for which defined°g:x
= T. Where

Def defined ≡ T̄

i.e. defined:x ≡ x=⊥ → ⊥; T. In general we shall write
a qualified functional equation:

p →→ f ≡ g

to mean that, for any object x, whenever p:x = T, then
f:x = g:x.
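A qualified equation is easy to test on sample objects. The sketch below is only an illustration under stated assumptions: None plays the role of the undefined object, tuples play the role of sequences, and the helpers qualified_eq and less_eq, like the sample functions f and g, are hypothetical names introduced here.

```python
BOT = None  # assumption: None stands in for bottom

def compose(f, g):
    """f°g: apply g first, then f."""
    return lambda x: f(g(x))

def construct(*fs):
    """[f1, ..., fn]; a sequence containing bottom is bottom."""
    def built(x):
        ys = tuple(f(x) for f in fs)
        return BOT if any(y is BOT for y in ys) else ys
    return built

def sel1(x):
    """Selector 1: first element of a nonempty sequence, bottom otherwise."""
    return x[0] if isinstance(x, tuple) and x else BOT

def defined(x):
    """defined:x is bottom when x is bottom and T otherwise."""
    return BOT if x is BOT else True

def qualified_eq(p, f, g, samples):
    """Check p ->-> f = g on the samples: whenever p:x = T, f:x = g:x."""
    return all(f(x) == g(x) for x in samples if p(x) is True)

def less_eq(f, g, samples):
    """Check f <= g on the samples: f:x is bottom, or f:x = g:x."""
    return all(f(x) is BOT or f(x) == g(x) for x in samples)

# Hypothetical sample functions: g is undefined off the positive integers.
f = lambda x: BOT if x is BOT else ("f", x)
g = lambda x: x if isinstance(x, int) and x > 0 else BOT

lhs = compose(sel1, construct(f, g))   # 1°[f,g]
samples = [-1, 0, 1, 2, BOT]

unqualified = all(lhs(x) == f(x) for x in samples)
qualified = qualified_eq(compose(defined, g), lhs, f, samples)
```

On these samples the unqualified equation fails (at -1 and 0, where g is undefined), while the qualified form holds.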

Ordinary algebra concerns itself with two operations, 
addition and multiplication; it needs few laws. The al- 
gebra of programs is concerned with more operations 
(functional forms) and therefore needs more laws. 

Each of the following laws requires a corresponding 
proposition to validate it. The interested reader will find 
most proofs of such propositions easy (two are given 
below). We first define the usual ordering on functions 
and equivalence in terms of this ordering: 

Definition. f≤g iff for all objects x, either f:x = ⊥, or
f:x = g:x.

Definition. f≡g iff f≤g and g≤f.

It is easy to verify that ≤ is a partial ordering, that f≤g
means g is an extension of f, and that f≡g iff f:x = g:x
for all objects x. We now give a list of algebraic laws
organized by the two principal functional forms involved.

I Composition and construction

I.1 [f₁, ... , fₙ]°g ≡ [f₁°g, ... , fₙ°g]

I.2 αf°[g₁, ... , gₙ] ≡ [f°g₁, ... , f°gₙ]

I.3 /f°[g₁, ... , gₙ]
      ≡ f°[g₁, /f°[g₂, ... , gₙ]] when n≥2
      ≡ f°[g₁, f°[g₂, ... , f°[gₙ₋₁, gₙ]...]]
    /f°[g] ≡ g

I.4 f°[x̄, g] ≡ (bu f x)°g

I.5 1°[f₁, ... , fₙ] ≤ f₁
    s°[f₁, ... , fₛ, ... , fₙ] ≤ fₛ for any selector s, s≤n
    defined°fᵢ (for all i≠s, 1≤i≤n) →→
      s°[f₁, ... , fₙ] ≡ fₛ

I.5.1 [f₁°1, ... , fₙ°n]°[g₁, ... , gₙ] ≡ [f₁°g₁, ... , fₙ°gₙ]

I.6 tl°[f₁] ≤ φ̄ and
    tl°[f₁, ... , fₙ] ≤ [f₂, ... , fₙ] for n≥2
    defined°f₁ →→ tl°[f₁] ≡ φ̄
      and tl°[f₁, ... , fₙ] ≡ [f₂, ... , fₙ] for n≥2

I.7 distl°[f, [g₁, ... , gₙ]] ≡ [[f,g₁], ... , [f,gₙ]]
    defined°f →→ distl°[f, φ̄] ≡ φ̄

    The analogous law holds for distr.

I.8 apndl°[f, [g₁, ... , gₙ]] ≡ [f, g₁, ... , gₙ]
    null°g →→ apndl°[f,g] ≡ [f]

    And so on for apndr, reverse, rotl, etc.

I.9 [... , ⊥̄, ...] ≡ ⊥̄

I.10 apndl°[f°g, αf°h] ≡ αf°apndl°[g,h]

I.11 pair & not°null°1 →→
      apndl°[[1°1, 2], distr°[tl°1, 2]] ≡ distr

Communications of the ACM, August 1978, Volume 21, Number 8

Where f&g ≡ and°[f,g];

pair ≡ atom → F̄; eq°[length, 2̄]

II Composition and condition (right associated paren- 
theses omitted) (Law II.2 is noted in Manna et al. [16], 
p. 493.) 

II.1 (p→f; g)°h ≡ p°h → f°h; g°h

II.2 h°(p→f; g) ≡ p → h°f; h°g

II.3 or°[q, not°q] →→ and°[p,q] → f;
      and°[p, not°q] → g; h ≡ p → (q→f; g); h

II.3.1 p → (p→f; g); h ≡ p → f; h

III Composition and miscellaneous

III.1 x̄°f ≤ x̄
      defined°f →→ x̄°f ≡ x̄

III.1.1 ⊥̄°f ≡ f°⊥̄ ≡ ⊥̄

III.2 f°id ≡ id°f ≡ f

III.3 pair →→ 1°distr ≡ [1°1, 2] also:
      pair →→ 1°tl ≡ 2 etc.

III.4 α(f°g) ≡ αf ° αg

III.5 null°g →→ αf°g ≡ φ̄

IV Condition and construction

IV.1 [f₁, ... , (p→g; h), ... , fₙ]
      ≡ p → [f₁, ... , g, ... , fₙ]; [f₁, ... , h, ... , fₙ]

IV.1.1 [f₁, ... , (p₁ → g₁; ... ; pₙ → gₙ; h), ... , fₘ]
      ≡ p₁ → [f₁, ... , g₁, ... , fₘ];
        ... ; pₙ → [f₁, ... , gₙ, ... , fₘ]; [f₁, ... , h, ... , fₘ]

This concludes the present list of algebraic laws; it is by 
no means exhaustive, there are many others. 

Proof of two laws 

We give the proofs of validating propositions for laws
I.10 and I.11, which are slightly more involved than most
of the others.

Proposition 1

apndl°[f°g, αf°h] ≡ αf°apndl°[g,h]

Proof. We show that, for every object x, both of the
above functions yield the same result.

Case 1. h:x is neither a sequence nor φ.

Then both sides yield ⊥ when applied to x.

Case 2. h:x = φ. Then

apndl°[f°g, αf°h]:x
   = apndl:<f°g:x, φ> = <f:(g:x)>

αf°apndl°[g,h]:x
   = αf°apndl:<g:x, φ> = αf:<g:x>
   = <f:(g:x)>

Case 3. h:x = <y₁, ... , yₙ>. Then

apndl°[f°g, αf°h]:x
   = apndl:<f°g:x, αf:<y₁, ... , yₙ>>
   = <f:(g:x), f:y₁, ... , f:yₙ>

αf°apndl°[g,h]:x
   = αf°apndl:<g:x, <y₁, ... , yₙ>>
   = αf:<g:x, y₁, ... , yₙ>
   = <f:(g:x), f:y₁, ... , f:yₙ> □

Proposition 2

pair & not°null°1 →→
   apndl°[[1², 2], distr°[tl°1, 2]] ≡ distr

where f&g is the function and°[f,g], and f² ≡ f°f.

Proof. We show that both sides produce the same result
when applied to any pair <x,y>, where x ≠ φ, as per the
stated qualification.

Case 1. x is an atom or ⊥. Then distr:<x,y> = ⊥, since
x ≠ φ. The left side also yields ⊥ when applied to <x,y>,
since tl°1:<x,y> = ⊥ and all functions are ⊥-preserving.

Case 2. x = <x₁, ... , xₙ>. Then

apndl°[[1², 2], distr°[tl°1, 2]]:<x,y>
   = apndl:<<1:x, y>, distr:<tl:x, y>>
   = apndl:<<x₁,y>, φ> = <<x₁,y>>   if tl:x = φ
   = apndl:<<x₁,y>, <<x₂,y>, ... , <xₙ,y>>>   if tl:x ≠ φ
   = <<x₁,y>, ... , <xₙ,y>>
   = distr:<x,y> □
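Both propositions lend themselves to a quick mechanical check. The Python sketch below is an illustration only: it assumes None for the undefined object and tuples for sequences, and its versions of apndl, distr, tl, and the selectors are simplified stand-ins written for this example, not definitive implementations of the FP primitives.

```python
BOT = None  # assumption: None stands in for bottom

def compose(*fs):
    """f1°f2°...°fn: apply the rightmost function first."""
    def composed(x):
        for f in reversed(fs):
            x = f(x)
        return x
    return composed

def construct(*fs):
    def built(x):
        ys = tuple(f(x) for f in fs)
        return BOT if any(y is BOT for y in ys) else ys
    return built

def sel(i):
    """Selector i: the ith element of a long enough sequence."""
    return lambda x: x[i - 1] if isinstance(x, tuple) and len(x) >= i else BOT

def tl(x):
    return x[1:] if isinstance(x, tuple) and x else BOT

def alpha(f):
    """Apply to all; bottom on non-sequences or if any result is bottom."""
    def mapped(x):
        if not isinstance(x, tuple):
            return BOT
        ys = tuple(f(y) for y in x)
        return BOT if any(y is BOT for y in ys) else ys
    return mapped

def apndl(x):
    if isinstance(x, tuple) and len(x) == 2 and isinstance(x[1], tuple):
        return (x[0],) + x[1]
    return BOT

def distr(x):
    if isinstance(x, tuple) and len(x) == 2 and isinstance(x[0], tuple):
        return tuple((xi, x[1]) for xi in x[0])
    return BOT

# Hypothetical choices of f, g, h for law I.10.
f = lambda n: 2 * n if isinstance(n, int) else BOT
g, h = sel(1), sel(2)

law_10_lhs = compose(apndl, construct(compose(f, g), compose(alpha(f), h)))
law_10_rhs = compose(alpha(f), apndl, construct(g, h))

law_11_lhs = compose(
    apndl,
    construct(construct(compose(sel(1), sel(1)), sel(2)),
              compose(distr, construct(compose(tl, sel(1)), sel(2)))))

samples_10 = [(3, (1, 2)), (3, ()), (3, 4), ("a", (1, 2)), BOT]
samples_11 = [((1, 2, 3), "y"), ((1,), "y"), (5, "y")]  # first element is not phi

law_10_ok = all(law_10_lhs(x) == law_10_rhs(x) for x in samples_10)
law_11_ok = all(law_11_lhs(x) == distr(x) for x in samples_11)
```

The pair <φ,y> is excluded from samples_11 on purpose: there the left side of I.11 is undefined while distr yields φ, which is why the law carries its qualification.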

12.3 Example: Equivalence of Two Matrix 
Multiplication Programs 

We have seen earlier the matrix multiplication pro- 
gram: 

Def MM ≡ ααIP ° αdistl ° distr ° [1, trans°2].

We shall now show that its initial segment, MM', where

Def MM' ≡ ααIP ° αdistl ° distr,

can be defined recursively. (MM' “multiplies” a pair of
matrices after the second matrix has been transposed.
Note that MM', unlike MM, gives ⊥ for all arguments
that are not pairs.) That is, we shall show that MM'
satisfies the following equation which recursively defines
the same function (on pairs):

f ≡ null°1 → φ̄; apndl°[αIP°distl°[1°1, 2], f°[tl°1, 2]].

Our proof will take the form of showing that the following
function, R,

Def R ≡ null°1 → φ̄;
      apndl°[αIP°distl°[1°1, 2], MM'°[tl°1, 2]]

is, for all pairs <x,y>, the same function as MM'. R
“multiplies” two matrices, when the first has more than
zero rows, by computing the first row of the “product”
(with αIP°distl°[1°1, 2]) and adjoining it to the “product”
of the tail of the first matrix and the second matrix.
Thus the theorem we want is

pair →→ MM' ≡ R,

from which the following is immediate:

MM ≡ MM' ° [1, trans°2] ≡ R ° [1, trans°2];

where

Def pair ≡ atom → F̄; eq°[length, 2̄].

Theorem: pair →→ MM' ≡ R

where

Def MM' ≡ ααIP ° αdistl ° distr
Def R ≡ null°1 → φ̄;
      apndl°[αIP°distl°[1², 2], MM'°[tl°1, 2]]

Proof.

Case 1. pair & null°1 →→ MM' ≡ R.

pair & null°1 →→ R ≡ φ̄   by def of R
pair & null°1 →→ MM' ≡ φ̄

since distr:<φ,x> = φ by def of distr
and αf:φ = φ by def of Apply to all.

And so: ααIP ° αdistl ° distr:<φ,x> = φ.

Thus pair & null°1 →→ MM' ≡ R.

Case 2. pair & not°null°1 →→ MM' ≡ R.

pair & not°null°1 →→ R ≡ R',   (1)

by def of R and R', where

Def R' ≡ apndl°[αIP°distl°[1², 2], MM'°[tl°1, 2]].

We note that

R' ≡ apndl°[f°g, αf°h]

where

f ≡ αIP°distl
g ≡ [1², 2]
h ≡ distr°[tl°1, 2]

αf ≡ α(αIP°distl) ≡ ααIP°αdistl (by III.4).   (2)

Thus, by I.10,

R' ≡ αf°apndl°[g,h].   (3)

Now apndl°[g,h] ≡ apndl°[[1², 2], distr°[tl°1, 2]],
thus, by I.11,

pair & not°null°1 →→ apndl°[g,h] ≡ distr.   (4)

And so we have, by (1), (2), (3) and (4),

pair & not°null°1 →→ R ≡ R'
      ≡ αf°distr ≡ ααIP°αdistl°distr ≡ MM'.

Case 1 and Case 2 together prove the theorem. □
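The theorem can be exercised on small matrices. The following Python sketch is a simplified illustration: matrices are tuples of row tuples, the handling of the undefined object is omitted (only well-formed pairs of matrices are assumed), and the names mm_prime, r, f_rec, and ip are introduced here for the example. The inner product is written directly rather than built from FP primitives.

```python
def compose(*fs):
    """f1°f2°...°fn: apply the rightmost function first."""
    def composed(x):
        for f in reversed(fs):
            x = f(x)
        return x
    return composed

def alpha(f):
    """Apply to all (well-formed sequences assumed)."""
    return lambda xs: tuple(f(x) for x in xs)

def distl(x):
    """distl:<y, <z1,...,zn>> = <<y,z1>, ..., <y,zn>>"""
    y, zs = x
    return tuple((y, z) for z in zs)

def distr(x):
    """distr:<<y1,...,yn>, z> = <<y1,z>, ..., <yn,z>>"""
    ys, z = x
    return tuple((y, z) for y in ys)

def ip(x):
    """Inner product of a pair of equal-length vectors."""
    u, v = x
    return sum(a * b for a, b in zip(u, v))

def trans(m):
    """Transpose a matrix given as a tuple of rows."""
    return tuple(zip(*m))

# MM' = aaIP ° adistl ° distr
mm_prime = compose(alpha(alpha(ip)), alpha(distl), distr)

def mm(x):
    """MM = MM' ° [1, trans°2]"""
    a, b = x
    return mm_prime((a, trans(b)))

def r(x):
    """R = null°1 -> phi; apndl°[aIP°distl°[1°1, 2], MM'°[tl°1, 2]]"""
    rows, cols = x
    if rows == ():
        return ()
    first_row = alpha(ip)(distl((rows[0], cols)))
    return (first_row,) + mm_prime((rows[1:], cols))

def f_rec(x):
    """The recursive equation, with f in place of MM' on the right."""
    rows, cols = x
    if rows == ():
        return ()
    first_row = alpha(ip)(distl((rows[0], cols)))
    return (first_row,) + f_rec((rows[1:], cols))

a = ((1, 2), (3, 4))
b = ((5, 6), (7, 8))
product = mm((a, b))
```

Here mm_prime, r, and f_rec agree on every pair tried, as the theorem predicts for all pairs.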

12.4 Expansion Theorems 

In the following subsections we shall be “solving” 
some simple equations (where by a “solution” we shall 
mean the “least” function which satisfies an equation). 
To do so we shall need the following notions and results 
drawn from the later subsection on foundations of the 
algebra, where their proofs appear. 

12.4.1 Expansion. Suppose we have an equation of
the form

f ≡ E(f)   (E1)

where E(f) is an expression involving f. Suppose further
that there is an infinite sequence of functions fᵢ for i = 0,
1, 2, ... , each having the following form:

f₀ ≡ ⊥̄
fᵢ₊₁ ≡ p₀ → q₀; ... ; pᵢ → qᵢ; ⊥̄   (E2)

where the pᵢ’s and qᵢ’s are particular functions, so that E
has the property:

E(fᵢ) ≡ fᵢ₊₁ for i = 0, 1, 2, ...   (E3)

Then we say that E is expansive and has the fᵢ’s as
approximating functions.

If E is expansive and has approximating functions as
in (E2), and if f is the solution of (E1), then f can be
written as the infinite expansion

f ≡ p₀ → q₀; ... ; pₙ → qₙ; ...   (E4)

meaning that, for any x, f:x ≠ ⊥ iff there is an n ≥ 0
such that (a) pᵢ:x = F for all i < n, and (b) pₙ:x = T, and
(c) qₙ:x ≠ ⊥. When f:x ≠ ⊥, then f:x = qₙ:x for this n.
(The foregoing is a consequence of the “expansion
theorem”.)

12.4.2 Linear expansion. A more helpful tool for
solving some equations applies when, for any function h,

E(h) ≡ p₀ → q₀; E₁(h)   (LE1)

and there exist pᵢ and qᵢ such that

E₁(pᵢ → qᵢ; h) ≡ pᵢ₊₁ → qᵢ₊₁; E₁(h)
      for i = 0, 1, 2, ...   (LE2)

and

E₁(⊥̄) ≡ ⊥̄.   (LE3)

Under the above conditions E is said to be linearly
expansive. If so, and f is the solution of

f ≡ E(f)   (LE4)

then E is expansive and f can again be written as the
infinite expansion

f ≡ p₀ → q₀; ... ; pₙ → qₙ; ...   (LE5)

using the pᵢ’s and qᵢ’s generated by (LE1) and (LE2).

Although the pᵢ’s and qᵢ’s of (E4) or (LE5) are not
unique for a given function, it may be possible to find
additional constraints which would make them so, in
which case the expansion (LE5) would comprise a
canonical form for a function. Even without uniqueness
these expansions often permit one to prove the
equivalence of two different function expressions, and they
often clarify a function’s behavior.

12.5 A Recursion Theorem 

Using three of the above laws and linear expansion, 
one can prove the following theorem of moderate gen- 
erality that gives a clarifying expansion for many recur- 
sively defined functions. 

Recursion theorem: Let f be a solution of

f ≡ p → g; Q(f)   (1)

where

Q(k) ≡ h°[i, k°j] for any function k   (2)

and p, g, h, i, j are any given functions, then

f ≡ p → g; p°j → Q(g); ... ; p°jⁿ → Qⁿ(g); ...   (3)

(where Qⁿ(g) is h°[i, Qⁿ⁻¹(g)°j], and jⁿ is j°jⁿ⁻¹ for
n ≥ 2) and

Qⁿ(g) ≡ /h°[i, i°j, ... , i°jⁿ⁻¹, g°jⁿ].   (4)

Proof. We verify that p → g; Q(f) is linearly expansive.
Let pₙ, qₙ and k be any functions. Then

Q(pₙ → qₙ; k)
   ≡ h°[i, (pₙ → qₙ; k)°j]            by (2)
   ≡ h°[i, (pₙ°j → qₙ°j; k°j)]        by II.1
   ≡ h°(pₙ°j → [i, qₙ°j]; [i, k°j])   by IV.1
   ≡ pₙ°j → h°[i, qₙ°j]; h°[i, k°j]   by II.2
   ≡ pₙ°j → Q(qₙ); Q(k)               by (2)   (5)

Thus if p₀ ≡ p and q₀ ≡ g, then (5) gives p₁ ≡ p°j and
q₁ ≡ Q(g) and in general gives the following functions
satisfying (LE2)

pₙ ≡ p°jⁿ and qₙ ≡ Qⁿ(g).   (6)

Finally,

Q(⊥̄) ≡ h°[i, ⊥̄°j]
   ≡ h°[i, ⊥̄]   by III.1.1
   ≡ h°⊥̄        by I.9
   ≡ ⊥̄          by III.1.1.   (7)

Thus (5) and (6) verify (LE2) and (7) verifies (LE3), with
E₁ ≡ Q. If we let E(f) ≡ p → g; Q(f), then we have
(LE1); thus E is linearly expansive. Since f is a solution
of f ≡ E(f), conclusion (3) follows from (6) and (LE5).
Now

Qⁿ(g) ≡ h°[i, Qⁿ⁻¹(g)°j]
   ≡ h°[i, h°[i°j, ... , h°[i°jⁿ⁻¹, g°jⁿ] ... ]]
         by I.1, repeatedly
   ≡ /h°[i, i°j, ... , i°jⁿ⁻¹, g°jⁿ]   by I.3   (8)

Result (8) is the second conclusion (4). □

12.5.1 Example: correctness proof of a recursive
factorial function. Let f be a solution of

f ≡ eq0 → 1̄; ×°[id, f°s]

where

Def s ≡ -°[id, 1̄] (subtract 1).

Then f satisfies the hypothesis of the recursion theorem
with p ≡ eq0, g ≡ 1̄, h ≡ ×, i ≡ id, and j ≡ s. Therefore

f ≡ eq0 → 1̄; ... ; eq0°sⁿ → Qⁿ(1̄); ...

and

Qⁿ(1̄) ≡ /×°[id, id°s, ... , id°sⁿ⁻¹, 1̄°sⁿ].

Now id°sᵏ ≡ sᵏ by III.2 and eq0°sⁿ →→ 1̄°sⁿ ≡ 1̄ by
III.1, since eq0°sⁿ:x implies defined°sⁿ:x; and also
eq0°sⁿ:x = eq0:(x - n) = x=n. Thus if eq0°sⁿ:x = T,
then x = n and

Qⁿ(1̄):n = n × (n - 1) × ... × (n - (n - 1))
      × (1̄:(n - n)) = n!.

Using these results for 1̄°sⁿ, eq0°sⁿ, and Qⁿ(1̄) in the
previous expansion for f, we obtain

f:x ≡ x=0 → 1; ... ; x=n
      → n × (n - 1) × ... × 1 × 1; ...

Thus we have proved that f terminates on precisely the
set of nonnegative integers and that it is the factorial
function thereon.
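The expansion just obtained can be evaluated clause by clause. The Python sketch below is an illustration under stated assumptions: None stands in for the undefined object, nontermination is approximated by a finite cutoff (the limit parameter, an artifact of this sketch), and the names power, q_n, and factorial_by_expansion are hypothetical.

```python
BOT = None  # assumption: None stands in for bottom

def is_integer(x):
    return isinstance(x, int) and not isinstance(x, bool)

def s(x):
    """s = -°[id, 1~]: subtract 1 from an integer, bottom otherwise."""
    return x - 1 if is_integer(x) else BOT

def eq0(x):
    """eq0:x is T or F on integers, bottom otherwise."""
    return (x == 0) if is_integer(x) else BOT

def power(f, n):
    """f^n: the n-fold composition of f (the identity when n = 0)."""
    def iterated(x):
        for _ in range(n):
            x = f(x)
        return x
    return iterated

def q_n(n, x):
    """Q^n(1~):x computed as /x:<x, s:x, ..., s^(n-1):x, 1>, per conclusion (4)."""
    product = 1                       # the last element: 1~ applied to s^n:x
    for m in reversed(range(n)):      # insert x from the right
        product = power(s, m)(x) * product
    return product

def factorial_by_expansion(x, limit=1000):
    """Scan the clauses eq0°s^n -> Q^n(1~) for n = 0, 1, 2, ..."""
    for n in range(limit):
        test = eq0(power(s, n)(x))
        if test is True:
            return q_n(n, x)
        if test is not False:
            return BOT                # the predicate itself is undefined
    return BOT                        # no clause fired within the cutoff

values = [factorial_by_expansion(n) for n in range(6)]
```

For a negative integer no clause ever fires, which the sketch reports as undefined once the cutoff is reached; in the algebra itself this corresponds to f:x = ⊥.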

12.6 An Iteration Theorem 

This is really a corollary of the recursion theorem. It 
gives a simple expansion for many iterative programs. 

Iteration theorem: Let f be the solution (i.e., the least
solution) of

f ≡ p → g; h°f°k

then

f ≡ p → g; p°k → h°g°k; ... ; p°kⁿ → hⁿ°g°kⁿ; ...

Proof. Let h' ≡ h°2, i' ≡ id, j' ≡ k, then

f ≡ p → g; h'°[i', f°j']

since h°2°[id, f°k] ≡ h°f°k by I.5 (id is defined except
for ⊥, and the equation holds for ⊥). Thus the recursion
theorem gives

f ≡ p → g; ... ; p°kⁿ → Qⁿ(g); ...

where

Qⁿ(g) ≡ h°2°[id, Qⁿ⁻¹(g)°k]
   ≡ h°Qⁿ⁻¹(g)°k ≡ hⁿ°g°kⁿ
         by I.5 □

12.6.1 Example: Correctness proof for an iterative
factorial function. Let f be the solution of

f ≡ eq0°1 → 2; f°[s°1, ×]

where Def s ≡ -°[id, 1̄] (subtract 1). We want to prove
that f:<x,1> = x! iff x is a nonnegative integer. Let p ≡
eq0°1, g ≡ 2, h ≡ id, k ≡ [s°1, ×]. Then

f ≡ p → g; h°f°k

and so

f ≡ p → g; ... ; p°kⁿ → g°kⁿ; ...   (1)

by the iteration theorem, since hⁿ ≡ id. We want to show
that

pair →→ kⁿ ≡ [aₙ, bₙ]   (2)

holds for every n ≥ 1, where

aₙ ≡ sⁿ°1   (3)

bₙ ≡ /×°[sⁿ⁻¹°1, ... , s°1, 1, 2]   (4)

Now (2) holds for n = 1 by definition of k. We assume
it holds for some n ≥ 1 and prove it then holds for
n + 1. Now

pair →→ kⁿ⁺¹ ≡ k°kⁿ ≡ [s°1, ×]°[aₙ, bₙ]   (5)

since (2) holds for n. And so

pair →→ kⁿ⁺¹ ≡ [s°aₙ, ×°[aₙ, bₙ]] by I.1 and I.5   (6)

To pass from (5) to (6) we must check that whenever aₙ
or bₙ yield ⊥ in (5), so will the right side of (6). Now

s°aₙ ≡ sⁿ⁺¹°1 ≡ aₙ₊₁   (7)

×°[aₙ, bₙ] ≡ /×°[sⁿ°1, sⁿ⁻¹°1, ... , s°1, 1, 2]
   ≡ bₙ₊₁ by I.3.   (8)

Combining (6), (7), and (8) gives

pair →→ kⁿ⁺¹ ≡ [aₙ₊₁, bₙ₊₁].   (9)

Thus (2) holds for n = 1 and holds for n + 1 whenever
it holds for n, therefore, by induction, it holds for every
n ≥ 1. Now (2) gives, for pairs:

defined°kⁿ →→ p°kⁿ ≡ eq0°1°[aₙ, bₙ]
   ≡ eq0°aₙ ≡ eq0°sⁿ°1   (10)

defined°kⁿ →→ g°kⁿ
   ≡ 2°[aₙ, bₙ] ≡ /×°[sⁿ⁻¹°1, ... , s°1, 1, 2]   (11)

(both use I.5). Now (1) tells us that f:<x,1> is defined iff
there is an n such that p°kⁱ:<x,1> = F for all i < n, and
p°kⁿ:<x,1> = T, that is, by (10), eq0°sⁿ:x = T, i.e.,
x=n; and g°kⁿ:<x,1> is defined, in which case, by (11),

f:<x,1> = /×:<1, 2, ... , x-1, x, 1> = n!,

which is what we set out to prove.
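The closed form for the iterates of k, and the resulting value of f on <x,1>, can be checked numerically. This Python sketch makes several simplifying assumptions: only pairs of integers are considered, None stands in for the undefined object, nontermination is approximated by a cutoff, and the names k_power, closed_form, and f_iter are introduced for the example.

```python
import math

BOT = None  # assumption: None stands in for bottom

def k(pair):
    """k = [s°1, x]: <a, b> goes to <a - 1, a * b>."""
    a, b = pair
    return (a - 1, a * b)

def k_power(n, pair):
    """k^n applied to a pair (k^0 is the identity)."""
    for _ in range(n):
        pair = k(pair)
    return pair

def closed_form(n, pair):
    """[a_n, b_n]:<x, y> = <x - n, (x - n + 1) * ... * x * y>, from (3) and (4)."""
    x, y = pair
    return (x - n, math.prod(range(x - n + 1, x + 1)) * y)

def f_iter(pair, limit=1000):
    """Expansion (1): the first n with eq0°1°k^n true selects 2°k^n."""
    for _ in range(limit):
        if pair[0] == 0:        # p = eq0°1
            return pair[1]      # g = 2
        pair = k(pair)
    return BOT                  # no clause fired within the cutoff

closed_form_ok = all(k_power(n, (7, 1)) == closed_form(n, (7, 1)) for n in range(8))
```

The accumulating second component is what makes this program iterative: the product is complete by the time the first component reaches zero, so h can be the identity.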

12.6.2 Example: proof of equivalence of two iterative
programs. In this example we want to prove that two
iteratively defined programs, f and g, are the same
function. Let f be the solution of

f ≡ p°1 → 2; h°f°[k°1, 2].   (1)

Let g be the solution of

g ≡ p°1 → 2; g°[k°1, h°2].   (2)

Then, by the iteration theorem:

f ≡ p₀ → q₀; ... ; pₙ → qₙ; ...   (3)

g ≡ p'₀ → q'₀; ... ; p'ₙ → q'ₙ; ...   (4)

where (letting r⁰ ≡ id for any r), for n = 0, 1, ...

pₙ ≡ p°1°[k°1, 2]ⁿ ≡ p°1°[kⁿ°1, 2]          by I.5.1   (5)

qₙ ≡ hⁿ°2°[k°1, 2]ⁿ ≡ hⁿ°2°[kⁿ°1, 2]        by I.5.1   (6)

p'ₙ ≡ p°1°[k°1, h°2]ⁿ ≡ p°1°[kⁿ°1, hⁿ°2]    by I.5.1   (7)

q'ₙ ≡ 2°[k°1, h°2]ⁿ ≡ 2°[kⁿ°1, hⁿ°2]        by I.5.1.  (8)

Now, from the above, using I.5,

defined°2 →→ pₙ ≡ p°kⁿ°1   (9)

defined°hⁿ°2 →→ p'ₙ ≡ p°kⁿ°1   (10)

defined°kⁿ°1 →→ qₙ ≡ q'ₙ ≡ hⁿ°2   (11)

Thus

defined°hⁿ°2 →→ defined°2 ≡ T̄   (12)

defined°hⁿ°2 →→ pₙ ≡ p'ₙ   (13)

and

f ≡ p₀ → q₀; ... ; pₙ → hⁿ°2; ...   (14)

g ≡ p'₀ → q'₀; ... ; p'ₙ → hⁿ°2; ...   (15)

since pₙ and p'ₙ provide the qualification needed for qₙ
≡ q'ₙ ≡ hⁿ°2.

Now suppose there is an x such that f:x ≠ g:x. Then
there is an n such that pᵢ:x = p'ᵢ:x = F for i < n, and pₙ:x
≠ p'ₙ:x. From (12) and (13) this can only happen when
hⁿ°2:x = ⊥. But since h is ⊥-preserving, hᵐ°2:x = ⊥ for
all m ≥ n. Hence f:x = g:x = ⊥ by (14) and (15). This
contradicts the assumption that there is an x for which
f:x ≠ g:x. Hence f ≡ g.

This example (by J. H. Morris, Jr.) is treated more 
elegantly in [16] on p. 498. However, some may find that 
the above treatment is more constructive, leads one more 
mechanically to the key questions, and provides more 
insight into the behavior of the two functions. 
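The two programs and their common expansion can be compared on concrete data. In the Python sketch below, the particular choices of p, k, and h are hypothetical (h is made deliberately partial so that the undefined case is exercised), None stands in for the undefined object, and the cutoff in expansion is an artifact of the sketch.

```python
BOT = None  # assumption: None stands in for bottom

def p(a):
    return a == 0

def k(a):
    return a - 1

def h(b):
    """A partial, bottom-preserving h: undefined above 100."""
    return BOT if b is BOT or b > 100 else 2 * b + 1

def f(x):
    """f = p°1 -> 2; h°f°[k°1, 2]"""
    if x is BOT:
        return BOT
    a, b = x
    return b if p(a) else h(f((k(a), b)))

def g(x):
    """g = p°1 -> 2; g°[k°1, h°2]"""
    if x is BOT:
        return BOT
    a, b = x
    if p(a):
        return b
    hb = h(b)
    return BOT if hb is BOT else g((k(a), hb))  # a pair containing bottom is bottom

def expansion(x, limit=50):
    """Expansions (14) and (15): the first n with p°k^n°1 true selects h^n°2."""
    a, b = x
    for _ in range(limit):
        if p(a):
            return b
        a, b = k(a), h(b)
    return BOT

samples = [(a, b) for a in range(5) for b in (1, 10, 60)]
programs_agree = all(f(x) == g(x) == expansion(x) for x in samples)
```

The sample (2, 60) shows the case the proof turns on: h becomes undefined partway through, and both programs then yield the undefined object.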


12.7 Nonlinear Equations 

The preceding examples have concerned “linear” 
equations (in which the “unknown” function does not 
have an argument involving itself). The question of the 
existence of simple expansions that “solve” “quadratic” 
and higher order equations remains open. 

The earlier examples concerned solutions of f ≡ E(f),
where E is linearly expansive. The following example
involves an E(f) that is quadratic and expansive (but
not linearly expansive).

12.7.1 Example: proof of idempotency ([16] p. 497).
Let f be the solution of

f ≡ E(f) ≡ p → id; f²°h.   (1)

We wish to prove that f ≡ f². We verify that E is
expansive (Section 12.4.1) with the following
approximating functions:

f₀ ≡ ⊥̄   (2a)

fₙ ≡ p → id; ... ; p°hⁿ⁻¹ → hⁿ⁻¹; ⊥̄ for n > 0   (2b)

First we note that p →→ fₙ ≡ id and so

p°hⁱ →→ fₙ°hⁱ ≡ hⁱ.   (3)

Now E(f₀) ≡ p → id; ⊥̄²°h ≡ f₁,   (4)

and

E(fₙ)
   ≡ p → id; fₙ°(p → id; ... ; p°hⁿ⁻¹ → hⁿ⁻¹; ⊥̄)°h
   ≡ p → id; fₙ°(p°h → h; ... ; p°hⁿ → hⁿ; ⊥̄°h)
   ≡ p → id; p°h → fₙ°h; ... ; p°hⁿ → fₙ°hⁿ; fₙ°⊥̄
   ≡ p → id; p°h → h; ... ; p°hⁿ → hⁿ; ⊥̄   by (3)
   ≡ fₙ₊₁.   (5)

Thus E is expansive by (4) and (5); so by (2) and Section
12.4.1 (E4)

f ≡ p → id; ... ; p°hⁿ → hⁿ; ... .   (6)

But (6), by the iteration theorem, gives

f ≡ p → id; f°h.   (7)

Now, if p:x = T, then f:x = x = f²:x, by (1). If p:x = F,
then

f:x = f²°h:x              by (1)
    = f:(f°h:x) = f:(f:x)  by (7)
    = f²:x.

If p:x is neither T nor F, then f:x = ⊥ = f²:x. Thus

f ≡ f².
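A concrete instance makes the result tangible. In this Python sketch the predicate p and the function h are hypothetical choices on the integers (chosen so that every computation terminates), and the three definitions correspond to equations (1), (7), and (6) respectively.

```python
def p(x):
    return x >= 10

def h(x):
    return x + 3

def f(x):
    """Equation (1): f = p -> id; f°f°h."""
    return x if p(x) else f(f(h(x)))

def f_linear(x):
    """Equation (7): f = p -> id; f°h."""
    return x if p(x) else f_linear(h(x))

def f_expansion(x, limit=100):
    """Expansion (6): the first n with p°h^n true selects h^n."""
    for _ in range(limit):
        if p(x):
            return x
        x = h(x)
    return None  # stands in for bottom: no clause fired within the cutoff

samples = range(-5, 15)
idempotent = all(f(f(x)) == f(x) for x in samples)
same_function = all(f(x) == f_linear(x) == f_expansion(x) for x in samples)
```

With these choices f maps every sample to the first value of x, x+3, x+6, ... that is at least 10, and applying f a second time changes nothing.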

12.8 Foundations for the Algebra of Programs 

Our purpose in this section is to establish the validity 
of the results stated in Section 12.4. Subsequent sections 
do not depend on this one, hence it can be skipped by 
readers who wish to do so. We use the standard concepts 
and results from [16], but the notation used for objects 
and functions, etc., will be that of this paper. 

We take as the domain (and range) for all functions
the set O of objects (which includes ⊥) of a given FP
system. We take F to be the set of functions, and 𝐅
(boldface F) to be the set of functional forms of that FP
system. We write E(f) for any function expression
involving functional forms, primitive and defined functions,
and the function symbol f; and we regard E as a functional
that maps a function f into the corresponding function E(f).
We assume that all f ∈ F are ⊥-preserving and that all
functional forms in 𝐅 correspond to continuous functionals
in every variable (e.g., [f, g] is continuous in both f
and g). (All primitive functions of the FP system given
earlier are ⊥-preserving, and all its functional forms are
continuous.)

Definitions. Let E(f) be a function expression. Let

f₀ ≡ ⊥̄
fᵢ₊₁ ≡ p₀ → q₀; ... ; pᵢ → qᵢ; ⊥̄ for i = 0, 1, ...

where pᵢ, qᵢ ∈ F. Let E have the property that

E(fᵢ) ≡ fᵢ₊₁ for i = 0, 1, ...

Then E is said to be expansive with the approximating
functions fᵢ. We write

f ≡ p₀ → q₀; ... ; pₙ → qₙ; ...

to mean that f ≡ limᵢ{fᵢ}, where the fᵢ have the form
above. We call the right side an infinite expansion of f.
We take f:x to be defined iff there is an n ≥ 0 such that
(a) pᵢ:x = F for all i < n, and (b) pₙ:x = T, and (c) qₙ:x
is defined, in which case f:x = qₙ:x.

Expansion theorem: Let E(f) be expansive with
approximating functions as above. Let f be the least
function satisfying

f ≡ E(f).

Then

f ≡ p₀ → q₀; ... ; pₙ → qₙ; ...

Proof. Since E is the composition of continuous
functionals (from 𝐅) involving only monotonic functions
(⊥-preserving functions from F) as constant terms, E is
continuous ([16] p. 493). Therefore its least fixed point f
is limᵢ{Eⁱ(⊥̄)} ≡ limᵢ{fᵢ} ([16] p. 494), which by definition
is the above infinite expansion for f. □

Definition. Let E(f) be a function expression satisfying
the following:

E(h) ≡ p₀ → q₀; E₁(h) for all h ∈ F   (LE1)

where pᵢ ∈ F and qᵢ ∈ F exist such that

E₁(pᵢ → qᵢ; h) ≡ pᵢ₊₁ → qᵢ₊₁; E₁(h)
      for all h ∈ F and i = 0, 1, ...   (LE2)

and

E₁(⊥̄) ≡ ⊥̄.   (LE3)

Then E is said to be linearly expansive with respect to
these pᵢ’s and qᵢ’s.

Linear expansion theorem: Let E be linearly expansive
with respect to pᵢ and qᵢ, i = 0, 1, ... . Then E is expansive
with approximating functions

f₀ ≡ ⊥̄   (1)

fᵢ₊₁ ≡ p₀ → q₀; ... ; pᵢ → qᵢ; ⊥̄.   (2)

Proof. We want to show that E(fᵢ) ≡ fᵢ₊₁ for any i ≥ 0.
Now

E(f₀) ≡ p₀ → q₀; E₁(⊥̄) ≡ p₀ → q₀; ⊥̄ ≡ f₁   (3)
      by (LE1) (LE3) (1).

Let i > 0 be fixed and let

fᵢ ≡ p₀ → q₀; w₁   (4a)

w₁ ≡ p₁ → q₁; w₂   (4b)

etc.

wᵢ₋₁ ≡ pᵢ₋₁ → qᵢ₋₁; ⊥̄.   (4-)

Then, for this i > 0

E(fᵢ) ≡ p₀ → q₀; E₁(fᵢ)      by (LE1)

E₁(fᵢ) ≡ p₁ → q₁; E₁(w₁)     by (LE2) and (4a)

E₁(w₁) ≡ p₂ → q₂; E₁(w₂)     by (LE2) and (4b)

etc.

E₁(wᵢ₋₁) ≡ pᵢ → qᵢ; E₁(⊥̄)    by (LE2) and (4-)
         ≡ pᵢ → qᵢ; ⊥̄        by (LE3)

Combining the above gives

E(fᵢ) ≡ fᵢ₊₁ for arbitrary i > 0, by (2).   (5)

By (3), (5) also holds for i = 0; thus it holds for all i ≥ 0.
Therefore E is expansive and has the required
approximating functions. □

Corollary. If E is linearly expansive with respect to pᵢ
and qᵢ, i = 0, 1, ... , and f is the least function satisfying

f ≡ E(f)   (LE4)

then

f ≡ p₀ → q₀; ... ; pₙ → qₙ; ... .   (LE5)
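The approximating functions of an expansive E can be generated by applying E repeatedly to the everywhere-undefined function. The Python sketch below does this for the functional of the recursive factorial example, E(f) ≡ eq0 → 1̄; ×°[id, f°s]; None stands in for the undefined object, and the names E, bottom, and approximation are illustrative.

```python
BOT = None  # assumption: None stands in for bottom

def bottom(x):
    """The everywhere-undefined function, f_0."""
    return BOT

def E(f):
    """E(f) = eq0 -> 1~; x°[id, f°s], on integers; bottom elsewhere."""
    def next_f(x):
        if not isinstance(x, int) or isinstance(x, bool):
            return BOT
        if x == 0:
            return 1
        rest = f(x - 1)
        return BOT if rest is BOT else x * rest
    return next_f

def approximation(i):
    """f_i: E applied i times to the everywhere-undefined function."""
    f = bottom
    for _ in range(i):
        f = E(f)
    return f

f3_values = [approximation(3)(x) for x in range(5)]
```

Here f₃ is defined exactly on 0, 1, and 2, matching the three clauses of its expansion; each further application of E extends the domain by one clause, and the least solution is the limit of the sequence.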

12.9 The Algebra of Programs for the Lambda Calculus
and for Combinators

Because Church’s lambda calculus [5] and the system
of combinators developed by Schonfinkel and Curry [6]
are the primary mathematical systems for representing
the notion of application of functions, and because they
are more powerful than FP systems, it is natural to
enquire what an algebra of programs based on those
systems would look like.

The lambda calculus and combinator equivalents of
FP composition, f°g, are

λfgx.(f(gx)) ≡ B

where B is a simple combinator defined by Curry. There
is no direct equivalent for the FP object <x,y> in the
Church or Curry systems proper; however, following
Landin [14] and Burge [4], one can use the primitive
functions prefix, head, tail, null, and atomic to introduce
the notion of list structures that correspond to FP
sequences. Then, using FP notation for lists, the lambda
calculus equivalent for construction is λfgx.<fx,gx>. A
combinatory equivalent is an expression involving prefix,
the null list, and two or more basic combinators. It is so
complex that I shall not attempt to give it.

If one uses the lambda calculus or combinatory
expressions for the functional forms f°g and [f,g] to
express the law I.1 in the FP algebra, [f,g]°h ≡
[f°h, g°h], the result is an expression so complex that the
sense of the law is obscured. The only way to make that
sense clear in either system is to name the two
functionals: composition ≡ B, and construction ≡ A, so that
Bfg ≡ f°g, and Afg ≡ [f,g]. Then I.1 becomes

B(Afg)h ≡ A(Bfh)(Bgh),

which is still not as perspicuous as the FP law.
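The named functionals can be written as curried functions and the restated law checked on examples. This Python sketch is only an illustration: it ignores the undefined object, handles construction of exactly two functions, and uses arbitrary integer functions for f, g, and h.

```python
# Curried functionals, following the naming in the text.
B = lambda f: lambda g: lambda x: f(g(x))        # Bfg = f°g
A = lambda f: lambda g: lambda x: (f(x), g(x))   # Afg = [f,g] (bottom ignored)

# Hypothetical sample functions on integers.
f = lambda n: n + 1
g = lambda n: 2 * n
h = lambda n: n - 3

lhs = B(A(f)(g))(h)           # B(Afg)h
rhs = A(B(f)(h))(B(g)(h))     # A(Bfh)(Bgh)

law_holds = all(lhs(x) == rhs(x) for x in range(-5, 6))
```

Note that B(f) by itself is a perfectly good value here, a function awaiting its second argument; this is the kind of function that has no counterpart in FP systems.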

The point of the above is that if one wishes to state 
clear laws like those of the FP algebra in either Church’s 
or Curry’s system, one finds it necessary to select certain 
functionals (e.g., composition and construction) as the 
basic operations of the algebra and to either give them 
short names or, preferably, represent them by some 
special notation as in FP. If one does this and provides 
primitives, objects, lists, etc., the result is an FP-like 
system in which the usual lambda expressions or com- 
binators do not appear. Even then these Church or Curry 
versions of FP systems, being less restricted, have some 
problems that FP systems do not have: 

a) The Church and Curry versions accommodate
functions of many types and can define functions that
do not exist in FP systems. Thus, Bf is a function that
has no counterpart in FP systems. This added power
carries with it problems of type compatibility. For
example, in f°g, is the range of g included in the domain
of f? In FP systems all functions have the same domain
and range.

b) The semantics of Church’s lambda calculus
depends on substitution rules that are simply stated but
whose implications are very difficult to fully comprehend.
The true complexity of these rules is not widely
recognized but is evidenced by the succession of able
logicians who have published “proofs” of the
Church-Rosser theorem that failed to account for one or another
of these complexities. (The Church-Rosser theorem, or
Scott’s proof of the existence of a model [22], is required
to show that the lambda calculus has a consistent
semantics.) The definition of pure Lisp contained a related
error for a considerable period (the “funarg” problem).
Analogous problems attach to Curry’s system as well.

In contrast, the formal (FFP) version of FP systems 
(described in the next section) has no variables and only 
an elementary substitution rule (a function for its name), 
and it can be shown to have a consistent semantics by a 
relatively simple fixed-point argument along the lines 
developed by Dana Scott and by Manna et al [16]. For 
such a proof see McJones [18]. 

12.10 Remarks 

The algebra of programs outlined above needs much
work to provide expansions for larger classes of equations
and to extend its laws and theorems beyond the
elementary ones given here. It would be interesting to explore
the algebra for an FP-like system whose sequence
constructor is not ⊥-preserving (law I.5 is strengthened, but
IV.1 is lost). Other interesting problems are: (a) Find
rules that make expansions unique, giving canonical
forms for functions; (b) find algorithms for expanding
and analyzing the behavior of functions for various
classes of arguments; and (c) explore ways of using the
laws and theorems of the algebra as the basic rules either
of a formal, preexecution “lazy evaluation” scheme [9,
10], or of one which operates during execution. Such
schemes would, for example, make use of the law
1°[f,g] ≤ f to avoid evaluating g:x.

13. Formal Systems for Functional Programming 
(FFP Systems) 

13.1 Introduction 

As we have seen, an FP system has a set of functions 
that depends on its set of primitive functions, its set of 
functional forms, and its set of definitions. In particular, 
its set of functional forms is fixed once and for all, and 
this set determines the power of the system in a major 
way. For example, if its set of functional forms is empty, 
then its entire set of functions is just the set of primitive
functions. In FFP systems one can create new functional 
forms. Functional forms are represented by object se- 
quences; the first element of a sequence determines 
which form it represents, while the remaining elements 
are the parameters of the form. 

The ability to define new functional forms in FFP 
systems is one consequence of the principal difference 
between them and FP systems: in FFP systems objects 
are used to “represent” functions in a systematic way. 
Otherwise FFP systems mirror FP systems closely. They 
are similar to, but simpler than, the Reduction (Red) 
languages of an earlier paper [2]. 

We shall first give the simple syntax of FFP systems,
then discuss their semantics informally, giving examples,
and finally give their formal semantics.

13.2 Syntax

We describe the set O of objects and the set E of
expressions of an FFP system. These depend on the
choice of some set A of atoms, which we take as given.
We assume that T (true), F (false), φ (the empty
sequence), and # (default) belong to A, as well as
“numbers” of various kinds, etc.

1) Bottom, ⊥, is an object but not an atom.

2) Every atom is an object.

3) Every object is an expression.

4) If x₁, ... , xₙ are objects [expressions], then
<x₁, ... , xₙ> is an object [resp., expression] called a
sequence (of length n) for n ≥ 1. The object [expression]
xᵢ for 1 ≤ i ≤ n, is the ith element of the sequence
<x₁, ... , xᵢ, ... , xₙ>. (φ is both a sequence and an atom;
its length is 0.)

5) If x and y are expressions, then (x:y) is an expression
called an application. x is its operator and y is its operand.
Both are elements of the expression.

6) If x = <x₁, ... , xₙ> and if one of the elements of x is
⊥, then x = ⊥. That is, <... , ⊥, ...> = ⊥.

7) All objects and expressions are formed by finite use
of the above rules.

A subexpression of an expression x is either x itself or 
a subexpression of an element of x. An FFP object is an 
expression that has no application as a subexpression. 
Given the same set of atoms, FFP and FP objects are 
the same. 
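
The syntax rules above can be mirrored in a small data model. The following Python sketch is illustrative only and is not part of the paper: it assumes atoms are Python strings and numbers, sequences are tuples, and the names BOTTOM, EMPTY, App, seq, is_atom, and is_object are hypothetical choices made for this example.

```python
from dataclasses import dataclass

class Bottom:
    """The undefined object (bottom); an object but not an atom (rule 1)."""
    def __repr__(self):
        return "BOTTOM"

BOTTOM = Bottom()
EMPTY = ()   # phi: the empty sequence, which is also an atom

@dataclass(frozen=True)
class App:
    """An application (operator:operand), rule 5."""
    operator: object
    operand: object

def seq(*elements):
    """Build <x1, ..., xn>; rule 6 collapses it to bottom if any element is bottom."""
    if any(e is BOTTOM for e in elements):
        return BOTTOM
    return tuple(elements)

def is_atom(x):
    # Assumption: atoms are strings and numbers, plus the empty sequence.
    return isinstance(x, (str, int, float)) or x == EMPTY

def is_object(x):
    """An object is an expression with no application as a subexpression."""
    if isinstance(x, App):
        return False
    if isinstance(x, tuple):
        return all(is_object(e) for e in x)
    return x is BOTTOM or is_atom(x)
```

With this model, seq("A", BOTTOM, "B") is BOTTOM, while seq("A", App("NULL", "A")) is an expression but not an object.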

13.3 Informal Remarks About FFP Semantics 

13.3.1 The meaning of expressions; the semantic 
function μ. Every FFP expression e has a meaning, μe, 
which is always an object; μe is found by repeatedly 
replacing each innermost application in e by its meaning. 
If this process is nonterminating, the meaning of e is ⊥. 
The meaning of an innermost application (x:y) (since it 
is innermost, x and y must be objects) is the result of 
applying the function represented by x to y, just as in FP 
systems, except that in FFP systems functions are rep- 
resented by objects, rather than by function expressions, 
with atoms (instead of function symbols) representing 
primitive and defined functions, and with sequences 
representing the FP functions denoted by functional 
forms. 

The association between objects and the functions 
they represent is given by the representation function, ρ, 
of the FFP system. (Both ρ and μ belong to the description 
of the system, not the system itself.) Thus if the 
atom NULL represents the FP function null, then 
ρNULL = null and the meaning of (NULL:A) is 
μ(NULL:A) = (ρNULL):A = null:A = F. 

From here on, as above, we use the colon in two senses. 
When it is between two objects, as in (NULL:A), it 
identifies an FFP application that denotes only itself; 
when it comes between a function and an object, as in 
(ρNULL):A or null:A, it identifies an FP-like application 
that denotes the result of applying the function to the 
object. 

The fact that FFP operators are objects makes possible 
a function, apply, which is meaningless in FP 
systems: 

apply:<x,y> = (x:y). 

The result of apply:<x,y>, namely (x:y), is meaningless 
in FP systems on two levels. First, (x:y) is not itself an 
object; it illustrates another difference between FP and 
FFP systems: some FFP functions, like apply, map objects 
into expressions, not directly into objects as FP 
functions do. However, the meaning of apply:<x,y> is 
an object (see below). Second, (x:y) could not be even an 
intermediate result in an FP system; it is meaningless in 
FP systems since x is an object, not a function, and FP 
systems do not associate functions with objects. Now if 
APPLY represents apply, then the meaning of 
(APPLY:<NULL,A>) is 

μ(APPLY:<NULL,A>) 

= μ((ρAPPLY):<NULL,A>) 

= μ(apply:<NULL,A>) 

= μ(NULL:A) = μ((ρNULL):A) 

= μ(null:A) = μF = F. 

The last step follows from the fact that every object is its 
own meaning. Since the meaning function μ eventually 
evaluates all applications, one can think of 
apply:<NULL,A> as yielding F even though the actual 
result is (NULL:A). 
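
The evaluation of (APPLY:<NULL,A>) can be traced with a few lines of Python. This is a minimal sketch under simplifying assumptions, not the paper's definition: only atoms may appear as operators, ⊥ and nontermination are not modeled, and the names App, RHO, apply_, and mu are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class App:
    """An FFP application (operator:operand)."""
    operator: object
    operand: object

def null(x):
    # FP's null: T for the empty sequence, F for any other object.
    return "T" if x == () else "F"

def apply_(pair):
    # apply:<x,y> = (x:y); the result is an expression, not an object.
    x, y = pair
    return App(x, y)

# The representation function rho, restricted to two primitive atoms.
RHO = {"NULL": null, "APPLY": apply_}

def mu(e):
    """Meaning of e: keep replacing applications until an object remains."""
    if isinstance(e, App):
        x, y = mu(e.operator), mu(e.operand)
        return mu(RHO[x](y))
    if isinstance(e, tuple):
        return tuple(mu(part) for part in e)
    return e   # every object is its own meaning
```

Here apply_(("NULL", "A")) returns the unevaluated application App("NULL", "A"), while mu(App("APPLY", ("NULL", "A"))) continues evaluating and returns "F".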

13.3.2 How objects represent functions; the representation 
function ρ. As we have seen, some atoms 
(primitive atoms) will represent the primitive functions of 
the system. Other atoms can represent defined functions 
just as symbols can in FP systems. If an atom is neither 
primitive nor defined, it represents ⊥̅, the function which 
is ⊥ everywhere. 

Sequences also represent functions and are analogous 
to the functional forms of FP. The function represented 
by a sequence is given (recursively) by the following rule. 

Metacomposition rule 

(ρ<x₁, ..., xₙ>):y = (ρx₁):<<x₁, ..., xₙ>, y>, 

where the xᵢ’s and y are objects. Here ρx₁ determines 
what functional form <x₁, ..., xₙ> represents, 
and x₂, ..., xₙ are the parameters of the form (in FFP, x₁ 
itself can also serve as a parameter). Thus, for example, 
let Def ρCONST ≡ 2∘1; then <CONST,x> in FFP 
represents the FP functional form x̅, since, by the 
metacomposition rule, if y ≠ ⊥, 

(ρ<CONST,x>):y = (ρCONST):<<CONST,x>,y> 

= 2∘1:<<CONST,x>,y> = x. 

Here we can see that the first, controlling, operator of a 
sequence or form, CONST in this case, always has as its 
operand, after metacomposition, a pair whose first ele- 
ment is the sequence itself and whose second element is 
the original operand of the sequence, y in this case. The 
controlling operator can then rearrange and reapply the 
elements of the sequence and original operand in a great 
variety of ways. The significant point about metacomposition 
is that it permits the definition of new functional 
forms, in effect, merely by defining new functions. It also 
permits one to write recursive functions without a 
definition. 

We give one more example of a controlling function 
for a functional form: Def ρCONS ≡ αapply∘tl∘distr. 
This definition results in <CONS,f₁, ..., fₙ>, where the 
fᵢ are objects, representing the same function as 
[ρf₁, ..., ρfₙ]. The following shows this. 

(ρ<CONS,f₁, ..., fₙ>):x 

= (ρCONS):<<CONS,f₁, ..., fₙ>,x> 

by metacomposition 

= αapply∘tl∘distr:<<CONS,f₁, ..., fₙ>,x> 

by def of ρCONS 

= αapply:<<f₁,x>, ..., <fₙ,x>> 

by def of tl and distr and ∘ 

= <apply:<f₁,x>, ..., apply:<fₙ,x>> 

by def of α 

= <(f₁:x), ..., (fₙ:x)> by def of apply. 

In evaluating the last expression, the meaning function 
μ will produce the meaning of each application, giving 
ρfᵢ:x as the ith element. 
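
The metacomposition rule and the two controlling functions above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the paper's formal semantics: ⊥ and undefined atoms are not modeled, operands are evaluated before the operator is applied, and the names App, const_ctl, cons_ctl, PRIMITIVES, rho, and mu are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class App:
    """An FFP application (operator:operand)."""
    operator: object
    operand: object

def const_ctl(pair):
    # rho CONST = 2 o 1: element 2 of element 1 of <<CONST,x>, y>.
    form, _y = pair
    return form[1]

def cons_ctl(pair):
    # rho CONS = (alpha apply) o tl o distr, applied to <<CONS,f1,...,fn>, x>.
    form, x = pair
    distributed = tuple((f, x) for f in form)       # distr
    tail = distributed[1:]                          # tl drops <CONS,x>
    return tuple(App(f, arg) for f, arg in tail)    # alpha apply

PRIMITIVES = {
    "CONST": const_ctl,
    "CONS": cons_ctl,
    "ID": lambda x: x,
    "NULL": lambda x: "T" if x == () else "F",
}

def rho(x):
    """The function represented by object x."""
    if isinstance(x, tuple) and x:
        # Metacomposition: (rho <x1,...,xn>):y = (rho x1):<<x1,...,xn>, y>
        return lambda y: rho(x[0])((x, y))
    return PRIMITIVES[x]

def mu(e):
    """Meaning of an expression: evaluate applications innermost first."""
    if isinstance(e, App):
        return mu(rho(mu(e.operator))(mu(e.operand)))
    if isinstance(e, tuple):
        return tuple(mu(part) for part in e)
    return e
```

Note that cons_ctl returns a sequence of unevaluated applications, exactly as in the last line of the derivation; it is mu that turns each of them into an object.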

Usually, in describing the function represented by a 
sequence, we shall give its overall effect rather than show 
how its controlling operator achieves that effect. Thus 
we would simply write 

(ρ<CONS,f₁, ..., fₙ>):x = <(f₁:x), ..., (fₙ:x)> 

instead of the more detailed account above. 

We need a controlling operator, COMP, to give us 
sequences representing the functional form composition. 
We take ρCOMP to be a primitive function such that, 
for all objects x, 

(ρ<COMP,f₁, ..., fₙ>):x 

= (f₁:(f₂:(... :(fₙ:x) ...))) for n ≥ 1. 
(I am indebted to Paul McJones for his observation that 
ordinary composition could be achieved by this primitive 
function rather than by using two composition rules in 
the basic semantics, as was done in an earlier paper 
[2].) 

Although FFP systems permit the definition and 
investigation of new functional forms, it is to be expected 
that most programming would use a fixed set of forms 
(whose controlling operators are primitives), as in FP, so 
that the algebraic laws for those forms could be em- 
ployed, and so that a structured programming style could 
be used based on those forms. 

In addition to its use in defining functional forms, 
metacomposition can be used to create recursive functions 
directly without the use of recursive definitions of 
the form Def f ≡ E(f). For example, if ρMLAST ≡ 
null∘tl∘2 → 1∘2; apply∘[1, tl∘2], then ρ<MLAST> ≡ 
last, where last:x ≡ x = <x₁, ..., xₙ> → xₙ; ⊥. Thus the 
operator <MLAST> works as follows: 

μ(<MLAST>:<A,B>) 

= μ(ρMLAST:<<MLAST>,<A,B>>) 

by metacomposition 

= μ(apply∘[1, tl∘2]:<<MLAST>,<A,B>>) 

= μ(apply:<<MLAST>,<B>>) 

= μ(<MLAST>:<B>) 

= μ(ρMLAST:<<MLAST>,<B>>) 

= μ(1∘2:<<MLAST>,<B>>) 

= B. 
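
The recursion in <MLAST> comes entirely from the controlling function being handed the form itself. A small Python sketch, assuming nonempty operands and ignoring ⊥, makes the steps visible by recording each application that is evaluated; the names App, mlast_ctl, CONTROLLERS, and mu are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class App:
    """An FFP application (operator:operand)."""
    operator: object
    operand: object

def mlast_ctl(pair):
    # rho MLAST = null o tl o 2 -> 1 o 2; apply o [1, tl o 2]
    form, xs = pair                 # pair is <<MLAST>, xs>
    if xs[1:] == ():                # null o tl o 2: one element left
        return xs[0]                # 1 o 2
    return App(form, xs[1:])        # apply:<<MLAST>, tl:xs>

CONTROLLERS = {"MLAST": mlast_ctl}

def mu(e, trace=None):
    """Evaluate e, appending each application encountered to trace."""
    while isinstance(e, App):
        if trace is not None:
            trace.append(e)
        form = e.operator
        # Metacomposition: (rho <x1,...>):y = (rho x1):<<x1,...>, y>
        e = CONTROLLERS[form[0]]((form, e.operand))
    return e
```

Evaluating mu(App(("MLAST",), ("A", "B")), steps) returns "B" and leaves the two applications (<MLAST>:<A,B>) and (<MLAST>:<B>) in steps, matching the derivation above. No recursive Python definition is involved: mlast_ctl never calls itself.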

13.3.3 Summary of the properties of ρ and μ. So far 
we have shown how ρ maps atoms and sequences into 
functions and how those functions map objects into 
expressions. Actually, ρ and all FFP functions can be 
extended so that they are defined for all expressions. 
With such extensions the properties of ρ and μ can be 
summarized as follows: 

1) μ ∈ [expressions → objects]. 

2) If x is an object, μx = x. 

3) If e is an expression and e = <e₁, ..., eₙ>, then 
μe = <μe₁, ..., μeₙ>. 

4) ρ ∈ [expressions → [expressions → expressions]]. 

5) For any expression e, ρe = ρ(μe). 

6) If x is an object and e an expression, then 
ρx:e = ρx:(μe). 

7) If x and y are objects, then μ(x:y) = μ(ρx:y). In 
words: the meaning of an FFP application (x:y) is found 
by applying ρx, the function represented by x, to y and 
then finding the meaning of the resulting expression 
(which is usually an object and is then its own meaning). 

13.3.4 Cells, fetching, and storing. For a number of 
reasons it is convenient to create functions which serve 
as names. In particular, we shall need this facility in 
describing the semantics of definitions in FFP systems. 
To introduce naming functions, that is, the ability to 
fetch the contents of a cell with a given name from a 
store (a sequence of cells) and to store a cell with given 
name and contents in such a sequence, we introduce 
objects called cells and two new functional forms, fetch 
and store. 

Cells 

A cell is a triple <CELL,name,contents>. We use this 
form instead of the pair <name,contents> so that cells 
can be distinguished from ordinary pairs. 

Fetch 

The functional form fetch takes an object n as its 
parameter (n is customarily an atom serving as a name); 
it is written ↑n (read “fetch n”). Its definition for objects 
n and x is 

↑n:x ≡ x = φ → #; atom:x → ⊥; 

(1:x) = <CELL,n,c> → c; ↑n∘tl:x, 

where # is the atom “default.” Thus ↑n (fetch n) applied 
to a sequence gives the contents of the first cell in the 
sequence whose name is n; if there is no cell named n, 
the result is default, #. Thus ↑n is the name function for 
the name n. (We assume that ρFETCH is the primitive 
function such that ρ<FETCH,n> ≡ ↑n. Note that ↑n 
simply passes over elements in its operand that are not 
cells.) 

Store and push, pop, purge 

Like fetch, store takes an object n as its parameter; it 
is written ↓n (“store n”). When applied to a pair <x,y>, 
where y is a sequence, ↓n removes the first cell named n 
from y, if any, then creates a new cell named n with 
contents x and appends it to y. Before defining ↓n (store 
n) we shall specify four auxiliary functional forms. 
(These can be used in combination with fetch n and store 
n to obtain multiple, named, LIFO stacks within a 
storage sequence.) Two of these auxiliary forms are 
specified by recursive functional equations; each takes 
an object n as its parameter. 

(cellname n) ≡ atom → F̅; 

eq∘[length, 3̅] → eq∘[[C̅E̅L̅L̅, n̅], [1, 2]]; F̅ 
(push n) ≡ pair → apndl∘[[C̅E̅L̅L̅, n̅, 1], 2]; ⊥̅ 
(pop n) ≡ null → φ̅; 

(cellname n)∘1 → tl; apndl∘[1, (pop n)∘tl] 
(purge n) ≡ null → φ̅; (cellname n)∘1 → (purge n)∘tl; 

apndl∘[1, (purge n)∘tl] 
↓n ≡ pair → (push n)∘[1, (pop n)∘2]; ⊥̅ 

The above functional forms work as follows. For 
x ≠ ⊥, (cellname n):x is T if x is a cell named n, otherwise 
it is F. (pop n):y removes the first cell named n from a 
sequence y; (purge n):y removes all cells named n from 
y. (push n):<x,y> puts a cell named n with contents 
x at the head of sequence y; ↓n:<x,y> is 
(push n):<x, (pop n):y>. 

(Thus (push n):<x,y> = y′ pushes x onto the top of 
a “stack” named n in y′; x can be read by ↑n:y′ = x and 
can be removed by (pop n):y′; thus ↑n∘(pop n):y′ is the 
element below x in the stack n, provided there is more 
than one cell named n in y′.) 
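
The behavior of fetch, store, and the auxiliary forms can be illustrated in Python. The sketch below assumes a store is a tuple and a cell is the triple ("CELL", name, contents); it uses loops where the paper uses recursive functional equations, and the function names are hypothetical.

```python
DEFAULT = "#"   # the atom "default"

def is_cell_named(n, x):
    """(cellname n):x, True when x is <CELL, n, contents>."""
    return isinstance(x, tuple) and len(x) == 3 and x[0] == "CELL" and x[1] == n

def fetch(n, store):
    """fetch n: contents of the first cell named n, or '#' if there is none."""
    for x in store:
        if is_cell_named(n, x):
            return x[2]
    return DEFAULT

def push(n, x, store):
    """(push n):<x,y> puts a new cell named n at the head of y."""
    return (("CELL", n, x),) + store

def pop(n, store):
    """(pop n):y removes the first cell named n, if any."""
    for i, x in enumerate(store):
        if is_cell_named(n, x):
            return store[:i] + store[i + 1:]
    return store

def purge(n, store):
    """(purge n):y removes every cell named n."""
    return tuple(x for x in store if not is_cell_named(n, x))

def store_cell(n, x, store):
    """store n: (push n):<x, (pop n):y>."""
    return push(n, x, pop(n, store))
```

Pushing twice under the same name gives the stack behavior described above: after push("S", 2, push("S", 1, ())), fetch("S", ...) reads 2, and fetching after one pop reads 1.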

13.3.5 Definitions in FFP systems. The semantics of 
an FFP system depends on a fixed set of definitions D 
(a sequence of cells), just as an FP system depends on its 
informally given set of definitions. Thus the semantic 
function μ depends on D; altering D gives a new μ′ that 
reflects the altered definitions. We have represented D 
as an object because in AST systems (Section 14) we 
shall want to transform D by applying functions to it and 
to fetch data from it — in addition to using it as the source 
of function definitions in FFP semantics. 

If <CELL,n,c> is the first cell named n in the sequence 
D (and n is an atom) then it has the same effect 
as the FP definition Def n ≡ ρc, that is, the meaning of 
(n:x) will be the same as that of ρc:x. Thus for example, 
if <CELL,CONST,<COMP,2,1>> is the first cell in D 
named CONST, then it has the same effect as 
Def CONST ≡ 2∘1, and the FFP system with that D 
would find 

μ(CONST:<<x,y>,z>) = y 
and consequently 
μ(<CONST,A>:B) = A. 

In general, in an FFP system with definitions D, the 
meaning of an application of the form (atom:x) is dependent 
on D; if ↑atom:D ≠ # (that is, atom is defined 
in D) then its meaning is μ(c:x), where c = ↑atom:D, the 
contents of the first cell in D named atom. If ↑atom:D 
= #, then atom is not defined in D and either atom is 
primitive, i.e. the system knows how to compute ρatom:x, 
and μ(atom:x) = μ(ρatom:x), or otherwise μ(atom:x) = ⊥. 

13.4 Formal Semantics for FFP Systems 

We assume that a set A of atoms, a set D of definitions, 
a set P ⊂ A of primitive atoms and the primitive 
functions they represent have all been chosen. We assume 
that ρa is the primitive function represented by a 
if a belongs to P, and that ρa = ⊥̅ if a belongs to Q, the 
set of atoms in A−P that are not defined in D. Although 
ρ is defined for all expressions (see 13.3.3), the formal 
semantics uses its definition only on P and Q. The 
functions that ρ assigns to other expressions x are implicitly 
determined and applied in the following semantic 
rules for evaluating μ(x:y). The above choices of A and 
D, and of P and the associated primitive functions determine 
the objects, expressions, and the semantic function 
μ_D for an FFP system. (We regard D as fixed and 
write μ for μ_D.) We assume D is a sequence and that ↑y:D 
can be computed (by the function ↑y as given in Section 
13.3.4) for any atom y. With these assumptions we define 
μ as the least fixed point of the functional τ, where the 
function τμ is defined as follows for any function μ (for 
all expressions x, xᵢ, y, yᵢ, z, and w): 

(τμ)x ≡ x ∈ A → x; 

x = <x₁, ..., xₙ> → <μx₁, ..., μxₙ>; 

x = (y:z) → 

(y ∈ A & (↑y:D) = # → μ((ρy)(μz)); 

y ∈ A & (↑y:D) = w → μ(w:z); 

y = <y₁, ..., yₙ> → μ(y₁:<y,z>); μ(μy:z)); ⊥ 

The above description of μ expands the operator of an 
application by definitions and by metacomposition before 
evaluating the operand. It is assumed that predicates 
like “x ∈ A” in the above definition of τμ are ⊥-preserving 
(e.g., “⊥ ∈ A” has the value ⊥) and that the 
conditional expression itself is also ⊥-preserving. Thus 
(τμ)⊥ ≡ ⊥ and (τμ)(⊥:z) ≡ ⊥. This concludes the semantics 
of FFP systems. 
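
The case analysis in the definition of τμ translates fairly directly into a recursive evaluator. The Python sketch below is a simplified illustration, not a faithful implementation: ⊥, nontermination, and undefined atoms are not modeled, the selectors 1 and 2 are represented by integer atoms, and the names App, fetch, comp_ctl, PRIMITIVES, and make_mu are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class App:
    """An FFP application (operator:operand)."""
    operator: object
    operand: object

DEFAULT = "#"

def fetch(name, store):
    """Contents of the first cell <CELL,name,c> in store, else '#'."""
    for item in store:
        if isinstance(item, tuple) and len(item) == 3 and item[:2] == ("CELL", name):
            return item[2]
    return DEFAULT

def comp_ctl(pair):
    # rho COMP: <<COMP,f1,...,fn>, x> -> (f1:(f2:(...(fn:x)...)))
    form, x = pair
    for f in reversed(form[1:]):
        x = App(f, x)
    return x

PRIMITIVES = {
    "COMP": comp_ctl,
    1: lambda x: x[0],   # selector functions, named here by integer atoms
    2: lambda x: x[1],
}

def make_mu(D):
    """Build the meaning function for a fixed sequence of definitions D."""
    def mu(x):
        if isinstance(x, tuple):                 # x = <x1,...,xn>
            return tuple(mu(e) for e in x)
        if not isinstance(x, App):               # x is an atom
            return x
        y, z = x.operator, x.operand
        if isinstance(y, App):                   # operator is not yet an object
            return mu(App(mu(y), z))
        if isinstance(y, tuple):                 # metacomposition
            return mu(App(y[0], (y, z)))
        w = fetch(y, D)
        if w != DEFAULT:                         # y is defined in D
            return mu(App(w, z))
        return mu(PRIMITIVES[y](mu(z)))          # y is a primitive atom
    return mu

# The definition Def CONST = 2 o 1 from Section 13.3.5, stored as a cell.
D = (("CELL", "CONST", ("COMP", 2, 1)),)
mu = make_mu(D)
```

As in the formal rules, the operator is expanded by definition lookup and metacomposition before the operand is evaluated; the operand is evaluated only when a primitive function is finally applied.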


14. Applicative State Transition Systems 
(AST Systems) 

14.1 Introduction 

This section sketches a class of systems mentioned 
earlier as alternatives to von Neumann systems. It must 
be emphasized again that these applicative state transi- 
tion systems are put forward not as practical program- 
ming systems in their present form, but as examples of 
a class in which applicative style programming is made 
available in a history sensitive, but non-von Neumann 
system. These systems are loosely coupled to states and 
depend on an underlying applicative system for both 
their programming language and the description of their 
state transitions. The underlying applicative system of 
the AST system described below is an FFP system, but 
other applicative systems could also be used. 

To understand the reasons for the structure of AST 
systems, it is helpful first to review the basic structure of 
a von Neumann system, Algol, observe its limitations, 
and compare it with the structure of AST systems. After 
that review a minimal AST system is described; a small, 
top-down, self-protecting system program for file main- 
tenance and running user programs is given, with direc- 
tions for installing it in the AST system and for running 
an example user program. The system program uses 
“name functions” instead of conventional names and the 
user may do so too. The section concludes with subsec- 
tions discussing variants of AST systems, their general 
properties, and naming systems. 

14.2 The Structure of Algol Compared to That of AST 
Systems 

An Algol program is a sequence of statements, each 
representing a transformation of the Algol state, which 
is a complex repository of information about the status 
of various stacks, pointers, and variable mappings of 
identifiers onto values, etc. Each statement communi- 
cates with this constantly changing state by means of 
complicated protocols peculiar to itself and even to its 
different parts (e.g., the protocol associated with the 
variable x depends on its occurrence on the left or right 
of an assignment, in a declaration, as a parameter, etc.). 

It is as if the Algol state were a complex “store” that 
communicates with the Algol program through an enor- 
mous “cable” of many specialized wires. The complex 
communications protocols of this cable are fixed and 
include those for every statement type. The “meaning” 
of an Algol program must be given in terms of the total 
effect of a vast number of communications with the state 
via the cable and its protocols (plus a means for identi- 
fying the output and inserting the input into the state). 
By comparison with this massive cable to the Algol 
state/store, the cable that is the von Neumann bottleneck 
of a computer is a simple, elegant concept. 

Thus Algol statements are not expressions represent- 
ing state-to-state functions that are built up by the use of 
orderly combining forms from simpler state-to-state 
functions. Instead they are complex messages with con- 
text-dependent parts that nibble away at the state. Each 
part transmits information to and from the state over the 
cable by its own protocols. There is no provision for 
applying general functions to the whole state and thereby 
making large changes in it. The possibility of large, 
powerful transformations of the state S by function 
application, S → f:S, is in fact inconceivable in the von 
Neumann (cable and protocol) context: there could be 
no assurance that the new state f:S would match the 
cable and its fixed protocols unless f is restricted to the 
tiny changes allowed by the cable in the first place. 

We want a computing system whose semantics does 
not depend on a host of baroque protocols for communicating 
with the state, and we want to be able to make 
large transformations in the state by the application of 
general functions. AST systems provide one way of 
achieving these goals. Their semantics has two protocols 
for getting information from the state: (1) get from it the 
definition of a function to be applied, and (2) get the 
whole state itself. There is one protocol for changing the 
state: compute the new state by function application. 
Besides these communications with the state, AST se- 
mantics is applicative (i.e. FFP). It does not depend on 
state changes because the state does not change at all 
during a computation. Instead, the result of a computa- 
tion is output and a new state. The structure of an AST 
state is slightly restricted by one of its protocols: It must 
be possible to identify a definition (i.e. cell) in it. Its 
structure — it is a sequence — is far simpler than that of 
the Algol state. 

Thus the structure of AST systems avoids the com- 
plexity and restrictions of the von Neumann state (with 
its communications protocols) while achieving greater 
power and freedom in a radically different and simpler 
framework. 

14.3 Structure of an AST System 

An AST system is made up of three elements: 

1) An applicative subsystem (such as an FFP system). 

2) A state D that is the set of definitions of the 
applicative subsystem. 

3) A set of transition rules that describe how inputs 
are transformed into outputs and how the state D is 
changed. 

The programming language of an AST system is just 
that of its applicative subsystem. (From here on we shall 
assume that the latter is an FFP system.) Thus AST 
systems can use the FP programming style we have 
discussed. The applicative subsystem cannot change the 
state D and it does not change during the evaluation of 
an expression. A new state is computed along with output 
and replaces the old state when output is issued. (Recall 
that a set of definitions D is a sequence of cells; a cell 
name is the name of a defined function and its contents 
is the defining expression. Here, however, some cells 
may name data rather than functions; a data name n will 
be used in ↑n (fetch n) whereas a function name will be 
used as an operator itself.) 

We give below the transition rules for the elementary 
AST system we shall use for examples of programs. 
These are perhaps the simplest of many possible transi- 
tion rules that could determine the behavior of a great 
variety of AST systems. 

14.3.1 Transition rules for an elementary AST system. 
When the system receives an input x, it forms the 
application (SYSTEM:x) and then proceeds to obtain its 
meaning in the FFP subsystem, using the current state 
D as the set of definitions. SYSTEM is the distinguished 
name of a function defined in D (i.e. it is the “system 
program”). Normally the result is a pair 

μ(SYSTEM:x) = <o,d> 

where o is the system output that results from input x 
and d becomes the new state D for the system’s next 
input. Usually d will be a copy or partly changed copy 
of the old state. If μ(SYSTEM:x) is not a pair, the output 
is an error message and the state remains unchanged. 

14.3.2 Transition rules: exception conditions and 
startup. Once an input has been accepted, our system 
will not accept another (except <RESET,x>, see below) 
until an output has been issued and the new state, if any, 
installed. The system will accept the input <RESET,x> 
at any time. There are two cases: (a) If SYSTEM is 
defined in the current state D, then the system aborts its 
current computation without altering D and treats x as 
a new normal input; (b) if SYSTEM is not defined in D, 
then x is appended to D as its first element. (This ends 
the complete description of the transition rules for our 
elementary AST system.) 

If SYSTEM is defined in D it can always prevent 
any change in its own definition. If it is not defined, 
an ordinary input x will produce μ(SYSTEM:x) = ⊥ 
and the transition rules yield an error message and 
an unchanged state; on the other hand, the input 
<RESET,<CELL,SYSTEM,s>> will define SYSTEM 
to be s. 
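
The transition rules of 14.3.1 and 14.3.2 amount to a short loop around the meaning function. The Python sketch below is illustrative only and makes several assumptions: the FFP subsystem is replaced by a stand-in (toy_meaning) in which cell contents are Python callables taking the state and the input, None stands in for ⊥, and a RESET that installs a cell returns None because the text specifies no output for that case. In a sequential model there is also no running computation to abort. All names are hypothetical.

```python
DEFAULT = "#"
BOTTOM = None   # stands in for the undefined object

def fetch(name, store):
    """Contents of the first cell <CELL,name,c> in store, else '#'."""
    for item in store:
        if isinstance(item, tuple) and len(item) == 3 and item[:2] == ("CELL", name):
            return item[2]
    return DEFAULT

def toy_meaning(D, name, x):
    """Stand-in for mu(name:x): cell contents are Python callables here."""
    f = fetch(name, D)
    return f(D, x) if callable(f) else BOTTOM

class ASTSystem:
    def __init__(self, meaning=toy_meaning):
        self.D = ()                  # empty initial state
        self.meaning = meaning

    def accept(self, x):
        """Process one input and return the output."""
        if isinstance(x, tuple) and len(x) == 2 and x[0] == "RESET":
            if fetch("SYSTEM", self.D) == DEFAULT:
                self.D = (x[1],) + self.D   # case (b): x becomes first element of D
                return None
            x = x[1]                        # case (a): treat x as a normal input
        result = self.meaning(self.D, "SYSTEM", x)
        if isinstance(result, tuple) and len(result) == 2:
            output, self.D = result         # <o,d>: issue o, install d
            return output
        return "ERROR"                      # not a pair: state unchanged

def counter(D, x):
    """Toy system program: output how many inputs have been seen so far."""
    seen = fetch("COUNT", D)
    n = 1 if seen == DEFAULT else seen + 1
    rest = tuple(c for c in D if c[:2] != ("CELL", "COUNT"))
    return (n, (("CELL", "COUNT", n),) + rest)
```

The counter program shows history sensitivity: it never mutates the state during a computation, yet successive inputs produce 1, 2, 3 because each computation returns a new state along with its output.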

14.3.3 Program access to the state; the function 
ρDEFS. Our FFP subsystem is required to have one new 
primitive function, defs, named DEFS such that for any 
object x ≠ ⊥, 

defs:x ≡ ρDEFS:x = D 

where D is the current state and set of definitions of the 
AST system. This function allows programs access to the 
whole state for any purpose, including the essential one 
of computing the successor state. 

14.4 An Example of a System Program 

The above description of our elementary AST system, 
plus the FFP subsystem and the FP primitives and 
functional forms of earlier sections, specify a complete 
history-sensitive computing system. Its input and output 
behavior is limited by its simple transition rules, but 
otherwise it is a powerful system once it is equipped with 
a suitable set of definitions. As an example of its use we 
shall describe a small system program, its installation, 
and operation. 

Our example system program will handle queries and 
updates for a file it maintains, evaluate FFP expressions, 
run general user programs that do not damage the file or 
the state, and allow authorized users to change the set of 
definitions and the system program itself. All inputs it 
accepts will be of the form <key,input> where key is a 
code that determines both the input class (system-change, 
expression, program, query, update) and also the identity 
of the user and his authority to use the system for the 
given input class. We shall not specify a format for key. 
Input is the input itself, of the class given by key. 

14.4.1 General plan of the system program. The state 
D of our AST system will contain the definitions of all 
nonprimitive functions needed for the system program 
and for users’ programs. (Each definition is in a cell of 
the sequence D.) In addition, there will be a cell in D 
named FILE with contents file, which the system maintains. 
We shall give FP definitions of functions and later 
show how to get them into the system in their FFP form. 
The transition rules make the input the operand of 
SYSTEM, but our plan is to use name-functions to refer 
to data, so the first thing we shall do with the input is to 
create two cells named KEY and INPUT with contents 
key and input and append these to D. This sequence of 
cells has one each for key, input, and file; it will be the 
operand of our main function called subsystem. Subsystem 
can then obtain key by applying ↑KEY to its operand, 
etc. Thus the definition 

Def system ≡ pair → subsystem∘f; [N̅O̅N̅P̅A̅I̅R̅, defs] 
where 

f ≡ ↓INPUT∘[2, ↓KEY∘[1, defs]] 

causes the system to output NONPAIR and leave the 
state unchanged if the input is not a pair. Otherwise, if 
it is <key,input>, then 

f:<key,input> = <<CELL,INPUT,input>, 

<CELL,KEY,key>, d₁, ..., dₙ> 

where D = <d₁, ..., dₙ>. (We might have constructed a 
different operand than the one above, one with just three 
cells, for key, input, and file. We did not do so because 
real programs, unlike subsystem, would contain many 
name functions referring to data in the state, and this 
“standard” construction of the operand would suffice 
then as well.) 
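
The construction of the operand by f can be sketched in Python. This is an illustration under assumptions: the state D is passed explicitly because Python has no counterpart of the primitive defs, subsystem is supplied by the caller as a two-argument function, and the names store_cell, f, and system are hypothetical.

```python
def store_cell(n, contents, store):
    """store n: drop the first cell named n, then add the new cell at the head."""
    for i, x in enumerate(store):
        if isinstance(x, tuple) and len(x) == 3 and x[:2] == ("CELL", n):
            store = store[:i] + store[i + 1:]
            break
    return (("CELL", n, contents),) + store

def f(x, D):
    """f = (store INPUT) o [2, (store KEY) o [1, defs]] applied to <key,input>."""
    key, inp = x
    return store_cell("INPUT", inp, store_cell("KEY", key, D))

def system(x, D, subsystem):
    """Def system = pair -> subsystem o f; [NONPAIR, defs]."""
    if isinstance(x, tuple) and len(x) == 2:
        return subsystem(f(x, D), D)
    return ("NONPAIR", D)
```

For D = <<CELL,FILE,data>>, f places the INPUT cell first and the KEY cell second, followed by the cells of D, which is the operand shown above.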

14.4.2 The “subsystem” function. We now give the 
FP definition of the function subsystem, followed by 
brief explanations of its six cases and auxiliary functions. 

Def subsystem ≡ 

is-system-change∘↑KEY → [report-change, apply]∘[↑INPUT, defs]; 
is-expression∘↑KEY → [↑INPUT, defs]; 
is-program∘↑KEY → system-check∘apply∘[↑INPUT, defs]; 
is-query∘↑KEY → [query-response∘[↑INPUT, ↑FILE], defs]; 
is-update∘↑KEY → 

[report-update, ↓FILE∘[update, defs]] 

∘[↑INPUT, ↑FILE]; 
[report-error∘[↑KEY, ↑INPUT], defs]. 

This subsystem has five “p → f;” clauses and a final 
default function, for a total of six classes of inputs; the 
treatment of each class is given below. Recall that the 
operand of subsystem is a sequence of cells containing 
key, input, and file as well as all the defined functions of 
D, and that subsystem:operand = <output,newstate>. 

Default inputs. In this case the result is given by the 
last (default) function of the definition when key does 
not satisfy any of the preceding clauses. The output is 
report-error:<key,input>. The state is unchanged since 
it is given by defs:operand = D. (We leave to the reader’s 
imagination what the function report-error will generate 
from its operand.) 

System-change inputs. When 

is-system-change∘↑KEY:operand = 

is-system-change:key = T, 

key specifies that the user is authorized to make a system 
change and that input = ↑INPUT:operand represents a 
function f that is to be applied to D to produce the new 
state f:D. (Of course f:D can be a useless new state; no 
constraints are placed on it.) The output is a report, 
namely report-change:<input,D>. 

Expression inputs. When is-expression:key = T, the 
system understands that the output is to be the meaning 
of the FFP expression input; ↑INPUT:operand produces 
it and it is evaluated, as are all expressions. The state is 
unchanged. 

Program inputs and system self-protection. When 
is-program:key = T, both the output and new state are 
given by (ρinput):D = <output,newstate>. If newstate 
contains file in suitable condition and the definitions of 
system and other protected functions, then 
system-check:<output,newstate> = <output,newstate>. 
Otherwise, system-check:<output,newstate> 

= <error-report,D>. 
Although program inputs can make major, possibly 
disastrous changes in the state when they produce newstate, 
system-check can use any criteria to either allow it to 
become the actual new state or to keep the old. A more 
sophisticated system-check might correct only prohibited 
changes in the state. Functions of this sort are possible 
because they can always access the old state for compar- 
ison with the new state-to-be and control what state 
transition will finally be allowed. 

File query inputs. If is-query:key = T, the function 
query-response is designed to produce the output = 
answer to the query input from its operand <input,file>. 

File update inputs. If is-update:key = T, input specifies 
a file transaction understood by the function update, 
which computes updated-file = update:<input,file>. Thus 
↓FILE has <updated-file,D> as its operand and thus 
stores the updated file in the cell FILE in the new state. 
The rest of the state is unchanged. The function report-update 
generates the output from its operand 
<input,file>. 
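
The six input classes can be sketched as a Python dispatch. Everything specific here is an assumption made for illustration, since the paper leaves these details open: key is taken to be just the class name, the file is a dict, a query input is a dict key, an update input is a (name, value) pair, expression inputs are zero-argument callables, system-change and program inputs are callables on the state, and system_check uses one arbitrary criterion. All function names are hypothetical.

```python
DEFAULT = "#"

def is_cell_named(n, x):
    return isinstance(x, tuple) and len(x) == 3 and x[:2] == ("CELL", n)

def fetch(n, store):
    """fetch n: contents of the first cell named n, else '#'."""
    return next((x[2] for x in store if is_cell_named(n, x)), DEFAULT)

def store_cell(n, contents, store):
    """store n: drop the first cell named n, add the new cell at the head."""
    for i, x in enumerate(store):
        if is_cell_named(n, x):
            store = store[:i] + store[i + 1:]
            break
    return (("CELL", n, contents),) + store

def system_check(result, D):
    # Hypothetical criterion: FILE must survive and SYSTEM must be unchanged.
    _output, new_state = result
    ok = (fetch("FILE", new_state) != DEFAULT
          and fetch("SYSTEM", new_state) == fetch("SYSTEM", D))
    return result if ok else ("error-report", D)

def subsystem(operand, D):
    """Return <output, newstate> for an operand holding KEY, INPUT, FILE cells."""
    key, inp, file = (fetch(n, operand) for n in ("KEY", "INPUT", "FILE"))
    if key == "system-change":      # new state is input applied to D
        return (("report-change", inp), inp(D))
    if key == "expression":         # evaluate; state unchanged
        return (inp(), D)
    if key == "program":            # program computes <output, newstate>
        return system_check(inp(D), D)
    if key == "query":
        return (file.get(inp, DEFAULT), D)
    if key == "update":
        name, value = inp
        updated = {**file, name: value}
        return (("report-update", name), store_cell("FILE", updated, D))
    return (("report-error", key, inp), D)   # default case

def run(key, inp, D):
    """Build the operand as f does, then apply subsystem."""
    operand = store_cell("INPUT", inp, store_cell("KEY", key, D))
    return subsystem(operand, D)
```

The program case shows the self-protection idea: a program that returns a state without a FILE cell is rejected by system_check, which has the old state D available for comparison and returns it unchanged.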

14.4.3 Installing the system program. We have de- 
scribed the function called system by some FP definitions 
(using auxiliary functions whose behavior is only indi- 
cated). Let us suppose that we have FP definitions for 
all the nonprimitive functions required. Then each definition 
can be converted to give the name and contents of 
a cell in D (of course this conversion itself would be done 
by a better system). The conversion is accomplished by 
changing each FP function name to its equivalent atom 
(e.g., update becomes UPDATE) and by replacing func- 
tional forms by sequences whose first member is the 
controlling function for the particular form. Thus 
↓FILE∘[update, defs] is converted to 

<COMP,<STORE,FILE>, 

<CONS,UPDATE,DEFS>>, 
and the FP function is the same as that represented by 
the FFP object, provided that update ≡ ρUPDATE and 
COMP, STORE, and CONS represent the controlling 
functions for composition, store, and construction. 

All FP definitions needed for our system can be 
converted to cells as indicated above, giving a sequence 
D₀. We assume that the AST system has an empty state 
to start with, hence SYSTEM is not defined. We want to 
define SYSTEM initially so that it will install its next 
input as the state; having done so we can then input D₀ 
and all our definitions will be installed, including our 
program, system, itself. To accomplish this we enter 
our first input 

<RESET, <CELL,SYSTEM,loader>> 

where loader = <CONS,<CONST,DONE>,ID>. 

Then, by the transition rule for RESET when SYSTEM 
is undefined in D, the cell in our input is put at the 
head of D = φ, thus defining ρSYSTEM ≡ ρloader = 
[D̅O̅N̅E̅, id]. Our second input is D₀, the set of definitions 
we wish to become the state. The regular transition rule 
causes the AST system to evaluate 
μ(SYSTEM:D₀) = [D̅O̅N̅E̅, id]:D₀ = <DONE,D₀>. Thus 
the output from our second input is DONE, the new 
state is D₀, and ρSYSTEM is now our system program 
(which only accepts inputs of the form <key,input>). 

Our next task is to load the file (we are given an 
initial value file). To load it we input a program into the 
newly installed system that contains file as a constant 
and stores it in the state; the input is 
<program-key, [D̅O̅N̅E̅, store-file]> where 

ρstore-file ≡ ↓FILE∘[f̅i̅l̅e̅, id]. 

Program-key identifies [D̅O̅N̅E̅, store-file] as a program 
to be applied to the state D₀ to give the output and new 
state D₁, which is: 

ρstore-file:D₀ = ↓FILE∘[f̅i̅l̅e̅, id]:D₀, 

or D₀ with a cell containing file at its head. The output 
is D̅O̅N̅E̅:D₀ = DONE. We assume that system-check 
will pass <DONE,D₁> unchanged. FP expressions have 
been used in the above in place of the FFP objects they 
denote, e.g. D̅O̅N̅E̅ for <CONST,DONE>. 

14.4.4 Using the system. We have not said how the 
system's file, queries or updates are structured, so we 
cannot give a detailed example of file operations. How- 
ever, the structure of subsystem shows clearly how the 
system's response to queries and updates depends on the 
functions query-response, update, and report-update. 

Let us suppose that matrices m, n named M and N 
are stored in D and that the function MM described 
earlier is defined in D. Then the input 

<expression-key, (MM°[↑M, ↑N]°DEFS:#)> 

would give the product of the two matrices as output and 
an unchanged state. Expression-key identifies the appli- 
cation as an expression to be evaluated and since defs:# 
= D and [↑M, ↑N]:D = <m,n>, the value of the expres- 
sion is the result MM:<m,n>, which is the output. 
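
As a rough illustration of this evaluation, the following Python sketch builds the expression from composition, construction and fetch functions. It rests on several assumptions: the state is a tuple of (name, contents) cells, mm is an ordinary matrix product standing in for the FP definition of MM, None stands in for the object #, and evaluate_expression is a hypothetical name for the expression-key case of the system program.

```python
from functools import reduce

def compose(*fs):
    """f1 ° f2 ° ... ° fn: the rightmost function is applied first."""
    return lambda x: reduce(lambda acc, f: f(acc), reversed(fs), x)

def construct(*fs):
    """[f1, ..., fn]:x = <f1:x, ..., fn:x>."""
    return lambda x: tuple(f(x) for f in fs)

def up(name):
    """Fetch function for `name`: maps a state (tuple of cells) to the named contents."""
    def fetch_name(state):
        for cell_name, contents in state:
            if cell_name == name:
                return contents
        raise KeyError(name)
    return fetch_name

def mm(pair):
    """Ordinary matrix product of <m, n>; a stand-in for the FP definition of MM."""
    m, n = pair
    columns = tuple(zip(*n))
    return tuple(tuple(sum(a * b for a, b in zip(row, col)) for col in columns)
                 for row in m)

def evaluate_expression(build_expr, state):
    """Expression-key case: output the expression's value, leave the state unchanged."""
    defs = lambda _any: state                 # defs:x = D for every x
    return build_expr(defs)(None), state      # None plays the role of the object #

D = (("M", ((1, 2), (3, 4))), ("N", ((5, 6), (7, 8))))
expr = lambda defs: compose(mm, construct(up("M"), up("N")), defs)
output, new_state = evaluate_expression(expr, D)
print(output)
```

The names M and N never appear as variables of the expression; they are turned into functions that select their values from the state.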

Our miniature system program has no provision for 
giving control to a user’s program to process many 
inputs, but it would not be difficult to give it that 
capability while still monitoring the user’s program with 
the option of taking control back. 

14.5 Variants of AST Systems 

A major extension of the AST systems suggested 
above would provide combining forms, system forms, 
for building a new AST system from simpler, component 
AST systems. That is, a system form would take AST 
systems as parameters and generate a new AST system, 
just as a functional form takes functions as parameters 
and generates new functions. These system forms would 
have properties like those of functional forms and would 
become the “operations” of a useful “algebra of systems” 
in much the same way that functional forms are the 
“operations” of the algebra of programs. However, the 
problem of finding useful system forms is much more 
difficult, since they must handle RESETs, match inputs 
and outputs, and combine history-sensitive systems 
rather than fixed functions. 

Moreover, the usefulness or need for system forms is 
less clear than that for functional forms. The latter are 
essential for building a great variety of functions from 
an initial primitive set, whereas, even without system 
forms, the facilities for building AST systems are already 
so rich that one could build virtually any system (with 
the general input and output properties allowed by the 
given AST scheme). Perhaps system forms would be 
useful for building systems with complex input and 
output arrangements. 
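
To make the idea of a system form concrete, here is one conceivable example in Python. It is purely hypothetical: no particular system form is defined in this paper, and the sketch ignores the handling of RESETs and the matching of inputs and outputs, which are exactly the hard parts noted above. Each system is modeled as a function from (state, input) to (output, new state); pipe, counter and logger are illustrative names.

```python
def pipe(first, second):
    """Hypothetical system form: feed each output of `first` to `second` as its input."""
    def combined(state, inp):
        state1, state2 = state                    # the combined state is a pair of states
        middle, new_state1 = first(state1, inp)
        output, new_state2 = second(state2, middle)
        return output, (new_state1, new_state2)
    return combined

def counter(state, inp):
    """History-sensitive component: outputs the running total of its inputs."""
    total = state + inp
    return total, total

def logger(state, inp):
    """Passes its input through and keeps a history of everything it has seen."""
    return inp, state + (inp,)

system = pipe(counter, logger)
state = (0, ())
out1, state = system(state, 5)
out2, state = system(state, 7)
print(out1, out2, state)
```

Just as a functional form builds a function from functions, pipe builds a system from systems; the result is again a history-sensitive system with its own state.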

14.6 Remarks About AST Systems 

As I have tried to indicate above, there can be 
innumerable variations in the ingredients of an AST 
system — how it operates, how it deals with input and 
output, how and when it produces new states, and so on. 
In any case, a number of remarks apply to any reasonable 
AST system: 

a) A state transition occurs once per major computa- 
tion and can have useful mathematical properties. State 
transitions are not involved in the tiniest details of a 
computation as in conventional languages; thus the lin- 
guistic von Neumann bottleneck has been eliminated. 
No complex “cable” or protocols are needed to com- 
municate with the state. 

b) Programs are written in an applicative language 
that can accommodate a great range of changeable parts, 
parts whose power and flexibility exceed that of any von 
Neumann language so far. The word-at-a-time style is 
replaced by an applicative style; there is no division of 
programming into a world of expressions and a world of 
statements. Programs can be analyzed and optimized by 
an algebra of programs. 

c) Since the state cannot change during the compu- 
tation of system:x, there are no side effects. Thus inde- 
pendent applications can be evaluated in parallel. 

d) By defining appropriate functions one can, I be- 
lieve, introduce major new features at any time, using 
the same framework. Such features must be built into 
the framework of a von Neumann language. I have in 
mind such features as: “stores” with a great variety of 
naming systems, types and type checking, communicat- 
ing parallel processes, nondeterminacy and Dijkstra's 
“guarded command” constructs [8], and improved meth- 
ods for structured programming. 

e) The framework of an AST system comprises the 
syntax and semantics of the underlying applicative sys- 
tem plus the system framework sketched above. By 
current standards, this is a tiny framework for a language 
and is the only fixed part of the system. 
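
Remarks (a) and (c) can be illustrated with a small Python sketch. It assumes an immutable tuple as the state and uses a thread pool merely to show that applications which only read the state may safely run concurrently; the transition function and the choice of sum and max as the independent applications are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def transition(state, inp):
    """One major computation: read the state freely, then change it exactly once."""
    # The state is an immutable tuple, so the two applications below cannot
    # interfere with each other and may be evaluated in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        total = pool.submit(sum, state)
        largest = pool.submit(max, state)
        output = (total.result(), largest.result())
    return output, state + (inp,)       # the single state change of this transition

state = (3, 1, 4)
output, new_state = transition(state, 1)
print(output, new_state)
```

The old state is still intact after the transition; only the value returned as the new state differs.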

14.7 Naming Systems in AST and von Neumann 
Models 

In an AST system, naming is accomplished by func- 
tions as indicated in Section 13.3.3. Many useful func- 
tions for altering and accessing a store can be defined 
(e.g. push, pop, purge, typed fetch, etc.). All these defi- 
nitions and their associated naming systems can be in- 
troduced without altering the AST framework. Different 
kinds of “stores” (e.g., with “typed cells”) with individual 
naming systems can be used in one program. A cell in 
one store may contain another entire store. 

The important point about AST naming systems is 
that they utilize the functional nature of names (Rey- 
nolds' GEDANKEN [19] also does so to some extent within 
a von Neumann framework). Thus name functions can 
be composed and combined with other functions by 
functional forms. In contrast, functions and names in 
von Neumann languages are usually disjoint concepts 
and the function-like nature of names is almost totally 
concealed and useless, because a) names cannot be ap- 
plied as functions; b) there are no general means to 
combine names with other names and functions; c) the 
objects to which name functions apply (stores) are not 
accessible as objects. 
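
The following Python sketch suggests what treating names as functions can look like. It assumes a store is a tuple of (name, contents) cells; up and down are illustrative stand-ins for fetch and store functions derived from a name, and the example stores are invented for the demonstration.

```python
def compose(f, g):
    """f ° g: apply g first, then f."""
    return lambda x: f(g(x))

def construct(*fs):
    """[f1, ..., fn]:x = <f1:x, ..., fn:x>."""
    return lambda x: tuple(f(x) for f in fs)

def up(name):
    """Name as fetch function: store -> contents of the first cell called `name`."""
    def fetch_name(store):
        for cell_name, contents in store:
            if cell_name == name:
                return contents
        raise KeyError(name)
    return fetch_name

def down(name):
    """Name as store function: (value, store) -> new store with the cell at its head."""
    def store_name(pair):
        value, store = pair
        rest = tuple(cell for cell in store if cell[0] != name)
        return ((name, value),) + rest
    return store_name

inner = (("X", 42), ("Y", 7))
outer = (("INNER", inner), ("A", 1))       # a cell may contain another entire store

x_of_inner = compose(up("X"), up("INNER"))                    # names composed with names
a_and_x = construct(up("A"), x_of_inner)                      # names combined by a functional form
copy_x_to_a = compose(down("A"), construct(x_of_inner, lambda s: s))

print(x_of_inner(outer), a_and_x(outer), copy_x_to_a(outer))
```

Here names are applied as functions, combined with other names by functional forms, and the stores they act on are ordinary objects, which are the three abilities listed above as missing from von Neumann languages.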

The failure of von Neumann languages to treat 
names as functions may be one of their more important 
weaknesses. In any case, the ability to use names as 
functions and stores as objects may turn out to be a 
useful and important programming concept, one which 
should be thoroughly explored. 


15. Remarks About Computer Design 

The dominance of von Neumann languages has left 
designers with few intellectual models for practical com- 
puter designs beyond variations of the von Neumann 
computer. Data flow models [1] [7] [13] are one alterna- 
tive class of history-sensitive models. The substitution 
rules of lambda-calculus based languages present serious 
problems for the machine designer. Berkling [3] has 
developed a modified lambda calculus that has three 
kinds of applications and that makes renaming of vari- 
ables unnecessary. He has developed a machine to eval- 
uate expressions of this language. Further experience is 
needed to show how sound a basis this language is for 
an effective programming style and how efficient his 
machine can be. 

Mago [15] has developed a novel applicative machine 
built from identical components (of two kinds). It eval- 
uates, directly, FP-like and other applicative expressions 
from the bottom up. It has no von Neumann store and 
no address register, hence no bottleneck; it is capable of 
evaluating many applications in parallel; its built-in op- 
erations resemble FP operators more than von Neumann 
computer operations. It is the farthest departure from 
the von Neumann computer that I have seen. 

There are numerous indications that the applicative 
style of programming can become more powerful than 
the von Neumann style. Therefore it is important for 
programmers to develop a new class of history-sensitive 
models of computing systems that embody such a style 
and avoid the inherent efficiency problems that seem to 
attach to lambda-calculus based systems. Only when 
these models and their applicative languages have proved 
their superiority over conventional languages will we 
have the economic basis to develop the new kind of 
computer that can best implement them. Only then, 
perhaps, will we be able to fully utilize large-scale inte- 
grated circuits in a computer design not limited by the 
von Neumann bottleneck. 


16. Summary 

The fifteen preceding sections of this paper can be 
summarized as follows. 

Section 1. Conventional programming languages 
are large, complex, and inflexible. Their limited expres- 
sive power is inadequate to justify their size and cost. 

Section 2. The models of computing systems that 
underlie programming languages fall roughly into three 
classes: (a) simple operational models (e.g., Turing ma- 
chines), (b) applicative models (e.g., the lambda calcu- 
lus), and (c) von Neumann models (e.g., conventional 
computers and programming languages). Each class of 
models has an important difficulty: The programs of 
class (a) are inscrutable; class (b) models cannot save 
information from one program to the next; class (c) 
models have unusable foundations and programs that 
are conceptually unhelpful. 

Section 3. Von Neumann computers are built 
around a bottleneck: the word-at-a-time tube connecting 
the CPU and the store. Since a program must make 
its overall change in the store by pumping vast numbers 
of words back and forth through the von Neumann 
bottleneck, we have grown up with a style of program- 
ming that concerns itself with this word-at-a-time traffic 
through the bottleneck rather than with the larger con- 
ceptual units of our problems. 

Section 4. Conventional languages are based on the 
programming style of the von Neumann computer. Thus 
variables = storage cells; assignment statements = fetch- 
ing, storing, and arithmetic; control statements = jump 
and test instructions. The symbol “:=” is the linguistic 
von Neumann bottleneck. Programming in a conven- 
tional (von Neumann) language still concerns itself 
with the word-at-a-time traffic through this slightly more 
sophisticated bottleneck. Von Neumann languages also 
split programming into a world of expressions and a 
world of statements; the first of these is an orderly world, 
the second is a disorderly one, a world that structured 
programming has simplified somewhat, but without at- 
tacking the basic problems of the split itself and of the 
word-at-a-time style of conventional languages. 

Section 5. This section compares a von Neumann 
program and a functional program for inner product. It 
illustrates a number of problems of the former and 
advantages of the latter: e.g., the von Neumann program 
is repetitive and word-at-a-time, works only for two 
vectors named a and b of a given length n, and can only 
be made general by use of a procedure declaration, 
which has complex semantics. The functional program 
is nonrepetitive, deals with vectors as units, is more 
hierarchically constructed, is completely general, and 
creates “housekeeping” operations by composing high- 
level housekeeping operators. It does not name its argu- 
ments, hence it requires no procedure declaration. 
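
The contrast can be sketched in Python as follows. The loop version mirrors the word-at-a-time program; the functional version is built only from combining forms: insert, apply-to-all and transpose, composed. The Python names (insert, apply_to_all, trans, ip_functional) are illustrative renderings of the FP operators, not code from the paper.

```python
from functools import reduce

def ip_loop(a, b, n):
    """Von Neumann style: one word at a time, tied to the names a, b and the length n."""
    c = 0
    for i in range(n):
        c = c + a[i] * b[i]
    return c

def compose(*fs):
    """f1 ° f2 ° ... ° fn: the rightmost function is applied first."""
    return lambda x: reduce(lambda acc, f: f(acc), reversed(fs), x)

def insert(f):
    """/f:<x1, ..., xn> = f:<x1, /f:<x2, ..., xn>> (a right fold over a nonempty sequence)."""
    def folded(xs):
        acc = xs[-1]
        for x in reversed(xs[:-1]):
            acc = f((x, acc))
        return acc
    return folded

def apply_to_all(f):
    """αf:<x1, ..., xn> = <f:x1, ..., f:xn>."""
    return lambda xs: tuple(f(x) for x in xs)

def trans(rows):
    """Transpose a sequence of equal-length sequences."""
    return tuple(zip(*rows))

add = lambda pair: pair[0] + pair[1]
mul = lambda pair: pair[0] * pair[1]

# Inner product = (insert +) ° (apply-to-all ×) ° transpose; no argument is ever named.
ip_functional = compose(insert(add), apply_to_all(mul), trans)

print(ip_loop((1, 2, 3), (6, 5, 4), 3), ip_functional(((1, 2, 3), (6, 5, 4))))
```

The functional definition works for any pair of conformable vectors without a procedure declaration, since it never refers to its arguments by name.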

Section 6. A programming language comprises a 
framework plus some changeable parts. The framework 
of a von Neumann language requires that most features 
must be built into it; it can accommodate only limited 
changeable parts (e.g., user-defined procedures) because 
there must be detailed provisions in the “state” and its 
transition rules for all the needs of the changeable parts, 
as well as for all the features built into the framework. 
The reason the von Neumann framework is so inflexible 
is that its semantics is too closely coupled to the state: 
every detail of a computation changes the state. 

Section 7. The changeable parts of von Neumann 
languages have little expressive power; this is why most 
of the language must be built into the framework. The 
lack of expressive power results from the inability of von 
Neumann languages to effectively use combining forms 
for building programs, which in turn results from the 
split between expressions and statements. Combining 
forms are at their best in expressions, but in von Neu- 
mann languages an expression can only produce a single 
word; hence expressive power in the world of expressions 
is mostly lost. A further obstacle to the use of combining 
forms is the elaborate use of naming conventions. 

Section 8. APL is the first language not based on 
the lambda calculus that is not word-at-a-time and uses 
functional combining forms. But it still retains many of 
the problems of von Neumann languages. 

Section 9. Von Neumann languages do not have 
useful properties for reasoning about programs. Axio- 
matic and denotational semantics are precise tools for 
describing and understanding conventional programs, 
but they only talk about them and cannot alter their 
ungainly properties. Unlike von Neumann languages, 
the language of ordinary algebra is suitable both for 
stating its laws and for transforming an equation into its 
solution, all within the “language.” 

Section 10. In a history-sensitive language, a pro- 
gram can affect the behavior of a subsequent one by 
changing some store which is saved by the system. Any 
such language requires some kind of state transition 
semantics. But it does not need semantics closely coupled 
to states in which the state changes with every detail of 
the computation. “Applicative state transition” (AST) 
systems are proposed as history-sensitive alternatives to 
von Neumann systems. These have: (a) loosely coupled 
state-transition semantics in which a transition occurs 
once per major computation; (b) simple states and tran- 
sition rules; (c) an underlying applicative system with 
simple “reduction” semantics; and (d) a programming 
language and state transition rules both based on the 
underlying applicative system and its semantics. The 
next four sections describe the elements of this approach 
to non-von Neumann language and system design. 

Section 11. A class of informal functional program- 
ming (FP) systems is described which use no variables. 
Each system is built from objects, functions, functional 
forms, and definitions. Functions map objects into ob- 
jects. Functional forms combine existing functions to 
form new ones. This section lists examples of primitive 
functions and functional forms and gives sample pro- 
grams. It discusses the limitations and advantages of FP 
systems. 

Section 12. An “algebra of programs” is described 
whose variables range over the functions of an FP system 
and whose “operations” are the functional forms of the 
system. A list of some twenty-four laws of the algebra is 
followed by an example proving the equivalence of a 
nonrepetitive matrix multiplication program and a re- 
cursive one. The next subsection states the results of two 
“expansion theorems” that “solve” two classes of equa- 
tions. These solutions express the “unknown” function 
in such equations as an infinite conditional expansion 
that constitutes a case-by-case description of its behavior 
and immediately gives the necessary and sufficient con- 
ditions for termination. These results are used to derive 
a “recursion theorem” and an “iteration theorem,” which 
provide ready-made expansions for some moderately 
general and useful classes of “linear” equations. Exam- 
ples of the use of these theorems treat: (a) correctness 
proofs for recursive and iterative factorial functions, and 
(b) a proof of equivalence of two iterative programs. A 
final example deals with a “quadratic” equation and 
proves that its solution is an idempotent function. The 
next subsection gives the proofs of the two expansion 
theorems. 

The algebra associated with FP systems is compared 
with the corresponding algebras for the lambda calculus 
and other applicative systems. The comparison shows 
some advantages to be drawn from the severely restricted 
FP systems, as compared with the much more powerful 
classical systems. Questions are suggested about algo- 
rithmic reduction of functions to infinite expansions and 
about the use of the algebra in various “lazy evaluation” 
schemes. 
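
Two laws of this kind, [f, g]°h = [f°h, g°h] and α(f°g) = αf°αg, can be spot-checked in Python. The sketch below is not a proof: it only compares both sides on a handful of sample objects, and the functions double, succ and square are arbitrary total functions chosen for the demonstration.

```python
def compose(f, g):
    """f ° g: apply g first, then f."""
    return lambda x: f(g(x))

def construct(*fs):
    """[f1, ..., fn]:x = <f1:x, ..., fn:x>."""
    return lambda x: tuple(f(x) for f in fs)

def apply_to_all(f):
    """αf:<x1, ..., xn> = <f:x1, ..., f:xn>."""
    return lambda xs: tuple(f(x) for x in xs)

def agree(lhs, rhs, samples):
    """True if both functions give equal results on every sample object."""
    return all(lhs(x) == rhs(x) for x in samples)

double = lambda n: 2 * n
succ = lambda n: n + 1
square = lambda n: n * n

# [f, g] ° h = [f ° h, g ° h]
law1_lhs = compose(construct(double, succ), square)
law1_rhs = construct(compose(double, square), compose(succ, square))

# α(f ° g) = αf ° αg
law2_lhs = apply_to_all(compose(double, succ))
law2_rhs = compose(apply_to_all(double), apply_to_all(succ))

print(agree(law1_lhs, law1_rhs, range(-3, 4)),
      agree(law2_lhs, law2_rhs, [(), (1,), (1, 2, 3)]))
```

In the algebra itself such laws are proved once for all functions, so a program can be transformed by rewriting one side into the other without inspecting any particular data.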

Section 13. This section describes formal functional 
programming (FFP) systems that extend and make pre- 
cise the behavior of FP systems. Their semantics is 
simpler than that of classical systems and can be shown 
to be consistent by a simple fixed-point argument. 

Section 14. This section compares the structure of 
Algol with that of applicative state transition (AST) 
systems. It describes an AST system using an FFP system 
as its applicative subsystem. It describes the simple state 
and the transition rules for the system. A small self- 
protecting system program for the AST system is de- 
scribed, and how it can be installed and used for file 
maintenance and for running user programs. The section 
briefly discusses variants of AST systems and functional 
naming systems that can be defined and used within an 
AST system. 

Section 15. This section briefly discusses work on 
applicative computer designs and the need to develop 
and test more practical models of applicative systems as 
the future basis for such designs. 

Acknowledgments. In earlier work relating to this 
paper I have received much valuable help and many 
suggestions from Paul R. McJones and Barry K. Rosen. 

I have had a great deal of valuable help and feedback in 
preparing this paper. James N. Gray was exceedingly 
generous with his time and knowledge in reviewing the 
first draft. Stephen N. Zilles also gave it a careful reading. 
Both made many valuable suggestions and criticisms at 
this difficult stage. It is a pleasure to acknowledge my 
debt to them. I also had helpful discussions about the 
first draft with Ronald Fagin, Paul R. McJones, and 
James H. Morris, Jr. Fagin suggested a number of im- 
provements in the proofs of theorems. 

Since a large portion of the paper contains technical 
material, I asked two distinguished computer scientists 
to referee the third draft. David J. Gries and John C. 
Reynolds were kind enough to accept this burdensome 
task. Both gave me large, detailed sets of corrections and 
overall comments that resulted in many improvements, 
large and small, in this final version (which they have 
not had an opportunity to review). I am truly grateful 
for the generous time and care they devoted to reviewing 
this paper. 

Finally, I also sent copies of the third draft to Gyula 
A. Mago, Peter Naur, and John H. Williams. They were 
kind enough to respond with a number of extremely 
helpful comments and corrections. Geoffrey A. Frank 
and Dave Tolle at the University of North Carolina 
reviewed Mago’s copy and pointed out an important 
error in the definition of the semantic function of FFP 
systems. My grateful thanks go to all these kind people 
for their help. 

Communications of the ACM, August 1978, Volume 21, Number 8 



References 

1. Arvind, and Gostelow, K.P. A new interpreter for data flow 
schemas and its implications for computer architecture. Tech. Rep. 
No. 72, Dept. Comptr. Sci., U. of California, Irvine, Oct. 1975. 

2. Backus, J. Programming language semantics and closed 
applicative languages. Conf. Record ACM Symp. on Principles of 
Programming Languages, Boston, Oct. 1973, 71-86. 

3. Berkling, K.J. Reduction languages for reduction machines. 
Interner Bericht ISF-76-8, Gesellschaft für Mathematik und 
Datenverarbeitung MBH, Bonn, Sept. 1976. 

4. Burge, W.H. Recursive Programming Techniques. Addison- 
Wesley, Reading, Mass., 1975. 

5. Church, A. The Calculi of Lambda-Conversion. Princeton U. 
Press, Princeton, N.J., 1941. 

6. Curry, H.B., and Feys, R. Combinatory Logic, Vol. 1. North- 
Holland Pub. Co., Amsterdam, 1958. 

7. Dennis, J.B. First version of a data flow procedure language. 
Tech. Mem. No. 61, Lab. for Comptr. Sci., M.I.T., Cambridge, Mass., 
May 1973. 

8. Dijkstra, E.W. A Discipline of Programming. Prentice-Hall, 
Englewood Cliffs, N.J., 1976. 

9. Friedman, D.P., and Wise, D.S. CONS should not evaluate its 
arguments. In Automata, Languages and Programming, S. Michaelson 
and R. Milner, Eds., Edinburgh U. Press, Edinburgh, 1976, pp. 
257-284. 

10. Henderson, P., and Morris, J.H. Jr. A lazy evaluator. Conf. 
Record Third ACM Symp. on Principles of Programming Languages, 
Atlanta, Ga., Jan. 1976, pp. 95-103. 

11. Hoare, C.A.R. An axiomatic basis for computer programming. 
Comm. ACM 12, 10 (Oct. 1969), 576-583. 

12. Iverson, K. A Programming Language. Wiley, New York, 1962. 

13. Kosinski, P. A data flow programming language. Rep. RC 4264, 
IBM T.J. Watson Research Ctr., Yorktown Heights, N.Y., March 
1973. 

14. Landin, P.J. The mechanical evaluation of expressions. Computer 
J. 6, 4 (1964), 308-320. 

15. Mago, G.A. A network of microprocessors to execute reduction 
languages. To appear in Int. J. Comptr. and Inform. Sci. 

16. Manna, Z., Ness, S., and Vuillemin, J. Inductive methods for 
proving properties of programs. Comm. ACM 16, 8 (Aug. 1973), 
491-502. 

17. McCarthy, J. Recursive functions of symbolic expressions and 
their computation by machine, Pt. 1. Comm. ACM 3, 4 (April 1960), 
184-195. 

18. McJones, P. A Church-Rosser property of closed applicative 
languages. Rep. RJ 1589, IBM Res. Lab., San Jose, Calif., May 1975. 

19. Reynolds, J.C. GEDANKEN: a simple typeless language based on 
the principle of completeness and the reference concept. Comm. 
ACM 13, 5 (May 1970), 308-318. 

20. Reynolds, J.C. Notes on a lattice-theoretic approach to the theory 
of computation. Dept. Syst. and Inform. Sci., Syracuse U., Syracuse, 
N.Y., 1972. 

21. Scott, D. Outline of a mathematical theory of computation. Proc. 
4th Princeton Conf. on Inform. Sci. and Syst., 1970. 

22. Scott, D. Lattice-theoretic models for various type-free calculi. 
Proc. Fourth Int. Congress for Logic, Methodology, and the 
Philosophy of Science, Bucharest, 1972. 

23. Scott, D., and Strachey, C. Towards a mathematical semantics 
for computer languages. Proc. Symp. on Comptrs. and Automata, 
Polytechnic Inst. of Brooklyn, 1971. 

